Abstract

In this paper, using spectral theory of Hilbertian operators, we study ARMA Gaussian
processes indexed by graphs. We extend Whittle maximum likelihood estimation
of the parameters of the corresponding spectral density and show its asymptotic
optimality.
Introduction



In the past few years, much interest has been paid to the study of random fields over graphs. 
It has been driven by the growing needs for both theoretical and practical results for data 
indexed by graphs. On the one hand, the definition of graphical models by J.N. Darroch, 
S.L. Lauritzen and T.P. Speed in 1980 [8] fostered new interest in Markov fields, and many 
tools have been developed in this direction (see, for instance, [23] and [22]). On the other
hand, the industrial demand linked to graphical problems has risen with the emergence of
new technologies. In particular, the Internet and social networks provide a huge field
of applications, but biology, economics, geography and image analysis also benefit from models
taking into account a graph structure.

The analysis of road traffic is at the root of this work. Actually, prediction of road traffic 
deals with the forecast of speed of vehicles which may be seen as a spatial random field over 
the traffic network. Some work has been done without taking into account the particular 
graph structure of the speed process (see for example [10] and [16] for related statistical 
issues). In this paper, we build a new model for Gaussian random fields over graphs and 
study statistical properties of such stochastic processes. 

A random field over a graph is a spatial process indexed by the vertices of a graph, namely
$(X_i)_{i \in G}$, where $G$ is a given graph. Many models already exist in the probabilistic literature,
ranging from Markov fields to autoregressive processes, which are based on two general kinds
of construction. On the one hand, graphical models are defined as Markov fields (see for
instance [11]), with a particular dependency structure. Actually, they are built by specifying
a dependency structure for $X_i$ and $X_j$, conditionally on the other variables, as soon as the
locations $i \in G$ and $j \in G$ are connected. For graphical models, we refer for instance to [8]
and references therein. On the other hand, the graph itself, through the adjacency operator,
can provide the dependency. This is the case, for example, of autoregressive models on $\mathbb{Z}^d$
(see [14]). Here, the local form of the graph is strongly used for statistical inference.

More precisely, the usual purpose of graphical models is to design an underlying graph
which reflects the dependency of the data. This method has to be applied when this graph
is not easily known (for instance social networks) or when it plays the role of a model which
helps in understanding the correlations between highly complex data (for instance for biological
purposes). Our approach differs since, in our case, the graph is known, and we aim at using
a model with stationarity properties. Indeed, in the case of road traffic, we can consider
that the correlations of the process depend mainly on the local structure of the network.
This assumption is commonly accepted among road traffic professionals, who speak of the
capacity of the road.

In this paper, we extend some classical results from time series to spatial fields over
general graphs and provide a new definition for regular ARMA processes on graphs. For
this, we make use of spectral analysis. In particular, the notion of spectral density may
be extended to graphs.
This will enable us to construct a maximum likelihood estimate for parametric models of
spectral densities. This also leads to an extension of Whittle's approximation (see, for
instance, [2]). Actually, many extensions of this approximation have been performed, even in
non-stationary cases (see [7], [12], [11]). The extension studied here concerns general ARMA
processes over graphs. Throughout the paper, we compare our new
framework with the case $G = \mathbb{Z}^d$, $d \geq 1$.

Section 1 is devoted to some definitions of graphs and spectral theory for time series.
Then we state the definition of general ARMA processes over a graph in Section 2. The
convergence of the Whittle maximum likelihood estimate and its asymptotic efficiency are
given in Theorems 3.1 and 3.2 in Section 3. Section 4 is devoted to a short discussion on
potential applications and perspectives. Some simulations are provided in Section 5. The
last section provides all necessary tools to prove the main theorems; in particular, Szegő's
Lemmas for graphs are given in Section 6.1, while the proofs of the technical lemmas are
postponed to Section 6.3.

1 Definitions and useful properties for spectral analysis and Toeplitz operators

1.1 Graphs, adjacency operator, and spectral representation 

In the whole paper, we will consider a Gaussian spatial process $(X_i)_{i \in G}$ indexed by the
vertices of an infinite undirected weighted graph.
We will call $\mathbf{G} = (G, W)$ this graph, where

• $G$ is the set of vertices. $\mathbf{G}$ is said to be infinite as soon as $G$ is infinite (but countable).

• $W \in [-1,1]^{G \times G}$ is the symmetric weighted adjacency operator. That is, $|W_{ij}| \neq 0$
when $i \in G$ and $j \in G$ are connected.

We assume that $W$ is symmetric ($W_{ij} = W_{ji}$, $i,j \in G$) since we deal only with undirected
graphs.

For any vertex $i \in G$, a vertex $j \in G$ is said to be a neighbor of $i$ if, and only if, $W_{ij} \neq 0$.
The degree $\deg(i)$ of $i$ is the number of neighbors of the vertex $i$, and the degree of the graph
$\mathbf{G}$ is defined as the maximum degree of the vertices of the graph $\mathbf{G}$:

\[ \deg(\mathbf{G}) := \max_{i \in G} \deg(i). \]

From now on, we assume that the degree of the graph $\mathbf{G}$ is bounded:

\[ \deg(\mathbf{G}) < +\infty. \]

Assume now that $W$ is renormalized: its entries belong to $\left[-\frac{1}{\deg(\mathbf{G})}, \frac{1}{\deg(\mathbf{G})}\right]$. This is not
restrictive since renormalizing the adjacency operator does not change the objects introduced
later. In particular, the spectral representation of a Hilbertian operator is not sensitive
to a renormalization.

Notice that in the classical case $G = \mathbb{Z}$, the renormalized adjacency operator is

\[ W^{(\mathbb{Z})}_{ij} = \frac{1}{2}\,\mathbf{1}_{\{|i-j|=1\}}, \quad (i,j \in \mathbb{Z}). \tag{1} \]
Here, $\deg(\mathbb{Z}) = 2$. This case will be used throughout the paper as an illustrative example.
To introduce the spectral decomposition, consider the action of the adjacency operator
on $l^2(G)$ as

\[ \forall u \in l^2(G), \quad (Wu)_i := \sum_{j \in G} W_{ij} u_j, \quad (i \in G). \]

We denote by $B_G$ the set of all bounded Hilbertian operators on $l^2(G)$ (the set of square
summable real sequences indexed by $G$). The operator space $B_G$ will be endowed with the
classical operator norm

\[ \forall A \in B_G, \quad \|A\|_{2,op} := \sup_{u \in l^2(G),\ \|u\|_2 \leq 1} \|Au\|_2, \]

where $\|\cdot\|_2$ stands for the usual norm on $l^2(G)$.

Notice that, as the degree of $\mathbf{G}$ and the entries of $W$ are both bounded, $W$ lies in $B_G$,
and we have

\[ \|W\|_{2,op} \leq 1. \]
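The bound on the operator norm can be checked numerically on a finite graph. The following is only a minimal sketch: the cycle on 8 vertices and the helper name `normalized_adjacency` are illustrative choices that do not come from the paper.

```python
import numpy as np

def normalized_adjacency(edges, n_vertices):
    """Symmetric 0/1 adjacency matrix divided by the maximum degree."""
    W = np.zeros((n_vertices, n_vertices))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    max_degree = int(W.sum(axis=1).max())
    return W / max_degree

# Hypothetical example: a cycle on 8 vertices (a finite stand-in for Z).
n = 8
cycle_edges = [(i, (i + 1) % n) for i in range(n)]
W = normalized_adjacency(cycle_edges, n)

# For a symmetric matrix, the operator norm is the largest absolute eigenvalue.
op_norm = np.abs(np.linalg.eigvalsh(W)).max()
print(op_norm)
```

Each row of the renormalized matrix has absolute sum at most 1, which is why the norm cannot exceed 1.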

Recall that for any bounded Hilbertian operator $A \in B_G$, the spectrum $\mathrm{Sp}(A)$ is defined
as the set of all complex numbers $\lambda$ such that $\lambda\,\mathrm{Id} - A$ is not invertible (here $\mathrm{Id}$ stands for
the identity on $l^2(G)$). Since $W$ is bounded and symmetric, $\mathrm{Sp}(W)$ is a non-empty compact
subset of $\mathbb{R}$ [20].

We aim now at providing a spectral representation of any bounded normal Hilbertian
operator. For this, first recall the definition of a resolution of identity (see for example [20]):

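Definition 1.1. Let $\mathcal{M}$ be a $\sigma$-algebra over a set $\Omega$. We call identity resolution (on $\mathcal{M}$) a
map

\[ E : \mathcal{M} \to B_G \]

such that,

1. $E(\emptyset) = 0$, $E(\Omega) = I$.

2. For any $\omega \in \mathcal{M}$, the operator $E(\omega)$ is a projection operator.

3. For any $\omega, \omega' \in \mathcal{M}$, we have

\[ E(\omega \cap \omega') = E(\omega)E(\omega') = E(\omega')E(\omega). \]

4. For any $\omega, \omega' \in \mathcal{M}$ such that $\omega \cap \omega' = \emptyset$, we have

\[ E(\omega \cup \omega') = E(\omega) + E(\omega'). \]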

We can now recall the fundamental decomposition theorem (see for example [20]).

Theorem 1.1 (Spectral decomposition). If $A \in B_G$ is symmetric, then there exists a unique
identity resolution $E$ over all Borel subsets of $\mathrm{Sp}(A)$, such that

\[ A = \int_{\mathrm{Sp}(A)} \lambda \, dE(\lambda). \]
From the last theorem, we obtain the spectral representation of the adjacency operator
$W$ thanks to an identity resolution $E$ over the Borel subsets of $\mathrm{Sp}(W)$:

\[ W = \int_{\mathrm{Sp}(W)} \lambda \, dE(\lambda). \]

Obviously, we have

\[ W^k = \int_{\mathrm{Sp}(W)} \lambda^k \, dE(\lambda), \quad k \in \mathbb{N}. \]

Define now, for any $i \in G$, the sequence $\delta_i$ in $l^2(G)$ by

\[ \delta_i := (\mathbf{1}_{k=i})_{k \in G}. \]

For any $i,j \in G$, the sequences $\delta_i$ and $\delta_j$ define the real measure $\mu_{ij}$ by

\[ \forall \omega \subset \mathrm{Sp}(W), \quad \mu_{ij}(\omega) := \langle E(\omega)\delta_i, \delta_j \rangle_{l^2(G)}. \]

Hence, we can write:

\[ \forall k \in \mathbb{N},\ \forall i,j \in G, \quad (W^k)_{ij} = \int_{\mathrm{Sp}(W)} \lambda^k \, d\mu_{ij}(\lambda). \]

This family of measures $(\mu_{ij})_{i,j \in G}$ will be used throughout the paper. They convey
both spectral information on the adjacency operator and combinatorial information on the
number of paths and loops in $\mathbf{G}$. Indeed, the quantity $(W^k)_{ij}$ is the number of paths (counted
with their weights) going from $i$ to $j$ with length $k$.

Note also that all diagonal measures $\mu_{ii}$, $i \in G$, are probability measures.
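On a finite graph the local measures are discrete and can be read off an eigendecomposition: the atoms are the eigenvalues of $W$ and the weight of the $m$-th atom in $\mu_{ij}$ is the product of the $i$-th and $j$-th coordinates of the $m$-th eigenvector. The sketch below checks the moment identity and the total mass of a diagonal measure. The path on 6 vertices and the function name `local_measure` are assumptions made for illustration only.

```python
import numpy as np

def local_measure(W, i, j):
    """Atoms and weights of the discrete measure mu_ij of a finite symmetric matrix W."""
    eigenvalues, eigenvectors = np.linalg.eigh(W)
    weights = eigenvectors[i, :] * eigenvectors[j, :]
    return eigenvalues, weights

# Hypothetical example: a path on 6 vertices, renormalized by its maximum degree 2.
n = 6
W = np.zeros((n, n))
for v in range(n - 1):
    W[v, v + 1] = W[v + 1, v] = 0.5

k = 4
atoms, weights = local_measure(W, 1, 3)
moment = np.sum(atoms**k * weights)                 # integral of lambda^k against mu_13
walk_weight = np.linalg.matrix_power(W, k)[1, 3]    # weighted count of 4-step walks 1 -> 3

diag_atoms, diag_weights = local_measure(W, 2, 2)   # a diagonal measure
print(moment, walk_weight, diag_weights.sum())
```

The off-diagonal weights may be negative, which is why $\mu_{ij}$ is only a signed measure in general.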

1.2 The adjacency operator of Z and its spectral decomposition 

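In the usual case of $\mathbb{Z}$, an explicit expression for $\mu_{ij}$ can be given.

Denote by $T_k$ the $k$-th Chebyshev polynomial ($k \in \mathbb{N}$). We can provide the spectral
decomposition of $W^{(\mathbb{Z})}$ ($W^{(\mathbb{Z})}$ has been defined in Equation (1)):

\[ \forall i,j \in \mathbb{Z}, \quad \left( (W^{(\mathbb{Z})})^k \right)_{ij} = \frac{1}{\pi} \int_{[-1,1]} \lambda^k \, \frac{T_{|j-i|}(\lambda)}{\sqrt{1-\lambda^2}} \, d\lambda. \]

This shows that, in this case, and for any $i,j \in \mathbb{Z}$, the measure $\mu_{ij}$ is absolutely continuous
with respect to the Lebesgue measure, and its density is given by

\[ \frac{d\mu_{ij}}{d\lambda} = \frac{1}{\pi} \frac{T_{|j-i|}(\lambda)}{\sqrt{1-\lambda^2}}. \]

Notice that we recover the usual spectral decomposition by writing $\mu_{ij}$ as the push-forward,
by the function $\cos$, of the measure $\hat\mu_{ij}$ defined by

\[ \forall i,j \in \mathbb{Z}, \quad d\hat\mu_{ij}(t) := \frac{1}{2\pi} \cos\left((j-i)t\right) dt. \]

We get

\[ \forall i,j \in \mathbb{Z}, \quad \left( (W^{(\mathbb{Z})})^k \right)_{ij} = \int_{[0,2\pi]} \cos(t)^k \, d\hat\mu_{ij}(t). \]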
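The trigonometric form of this identity can be verified numerically. The sketch below compares the combinatorial value of the entries of the $k$-th power of the renormalized adjacency operator of $\mathbb{Z}$ (a binomial count of walks, each step having weight 1/2) with the integral of $\cos(t)^k \cos(nt)/(2\pi)$ over $[0, 2\pi]$, where $n = j - i$. The function names and the grid size are illustrative assumptions.

```python
import numpy as np
from math import comb

def walk_weight_on_Z(k, n):
    """Weighted number of k-step nearest-neighbour walks on Z with displacement n."""
    if abs(n) > k or (k + n) % 2:
        return 0.0
    return comb(k, (k + n) // 2) / 2**k

def cosine_integral(k, n, grid_size=4096):
    """(1 / 2 pi) * integral over [0, 2 pi] of cos(t)^k cos(n t) dt.

    The rectangle rule on a uniform periodic grid is exact for trigonometric
    polynomials of degree below grid_size.
    """
    t = 2 * np.pi * np.arange(grid_size) / grid_size
    return float(np.mean(np.cos(t) ** k * np.cos(n * t)))

for k in range(6):
    for n in range(-3, 4):
        print(k, n, walk_weight_on_Z(k, n), round(cosine_integral(k, n), 12))
```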
1.3 Time series, spectral representation, and $MA_\infty$

Our aim is to study some kind of stationary processes indexed by the vertices $G$ of the graph
$\mathbf{G}$. To begin with, let us recall the usual case of $\mathbb{Z}$. In particular, let us introduce Toeplitz
operators associated to stationary time series.

Let $X = (X_i)_{i \in \mathbb{Z}}$ be a stationary Gaussian process indexed by $\mathbb{Z}$. Since $X$ is Gaussian,
stationarity is equivalent to second order stationarity, that is, $\forall i,k \in \mathbb{Z}$, $\mathrm{Cov}(X_i, X_{i+k})$ does
not depend on $i$. Thus, we can define

\[ r_k := \mathrm{Cov}(X_i, X_{i+k}). \]

Assume further that $(r_k)_{k \in \mathbb{Z}} \in l^1(\mathbb{Z})$. This leads to a particular form of the covariance
operator $\Gamma$ defined on $l^2(\mathbb{Z})$ by

\[ \forall i,j \in \mathbb{Z}, \quad \Gamma_{ij} := r_{i-j}. \]

Recall that $B_{\mathbb{Z}}$ denotes here the set of bounded Hilbertian operators on $l^2(\mathbb{Z})$. Notice that,
since $(r_k)_{k \in \mathbb{Z}} \in l^1(\mathbb{Z})$, we have $\Gamma \in B_{\mathbb{Z}}$ (see for instance [5] for more details). This bounded
operator is constant over each diagonal, and is therefore called a Toeplitz operator (see also
[4] for a general introduction to Toeplitz operators).

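As $(r_k)_{k \in \mathbb{Z}} \in l^1(\mathbb{Z})$, we have

\[ \forall i,j \in \mathbb{Z}, \quad T(g)_{ij} := \Gamma_{ij} = \frac{1}{2\pi} \int_{[0,2\pi]} g(t) \cos\left((i-j)t\right) dt, \]

where $g$ is the spectral density of the process $X$, defined by

\[ g(t) := 2 \sum_{k \in \mathbb{N}^*} r_k \cos(kt) + r_0. \]

This expression can be written, using the Chebyshev polynomials $(T_k)_{k \in \mathbb{N}}$,

\[ g(t) = 2 \sum_{k \in \mathbb{N}^*} r_k T_k(\cos(t)) + r_0 T_0(\cos(t)). \]

Let, for $\lambda \in [-1,1]$,

\[ f(\lambda) := 2 \sum_{k \in \mathbb{N}^*} r_k T_k(\lambda) + r_0 T_0(\lambda). \tag{2} \]

We get, using the family $(\hat\mu_{ij})_{i,j \in \mathbb{Z}}$ defined above,

\[ \forall i,j \in \mathbb{Z}, \quad \Gamma_{ij} = \int_{[0,2\pi]} f(\cos(t)) \, d\hat\mu_{ij}(t). \]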

Notice that the last expression may also be written as $\Gamma = f(W^{(\mathbb{Z})})$, and the convergence
of the operator valued series defined by Equation (2) is ensured by the boundedness of $W^{(\mathbb{Z})}$ and
of the Chebyshev polynomials ($T_k([-1,1]) \subset [-1,1]$, $\forall k \in \mathbb{N}$), together with the summability
of the sequence $(r_k)_{k \in \mathbb{Z}}$.

We will extend usual MA processes to any graph, using this previous remark. This will
be the purpose of Section 2.
Let us recall some properties about the moving average representation $MA_\infty$ of a process
on $\mathbb{Z}$. This representation exists as soon as the log of the spectral density is integrable.
In this case, there exists a sequence $(a_k)_{k \in \mathbb{N}}$, with $a_0 = 1$, and a Gaussian
white noise $\epsilon = (\epsilon_k)_{k \in \mathbb{Z}}$, such that the process $X$ may be written as

\[ \forall i \in \mathbb{Z}, \quad X_i = \sum_{k \in \mathbb{N}} a_k \epsilon_{i-k}. \]

Defining the function $h$ over the unit circle $\mathcal{C}$ by

\[ \forall x \in \mathcal{C}, \quad h(x) := \sum_{k \in \mathbb{N}} a_k x^k, \]

we recover, with a few computations, the spectral decomposition of the covariance operator
$\Gamma$ of $X$:

\[ \forall i,j \in \mathbb{Z}, \quad \Gamma_{ij} = \int_{[0,2\pi]} |h(e^{it})|^2 \, d\hat\mu_{ij}(t). \]

This implies the equality

\[ f(\cos(t)) = |h(e^{it})|^2. \]

Recall that when $h$ is a polynomial of degree $p$ (with non-null first coefficient), the process
is said to be $MA_p$. In this case, $f$ is also a polynomial of degree $p$. Conversely, if $f$ is a real
polynomial of degree $p$, and as soon as $f(\cos(t))$ is even and non-negative for any $t \in [0, 2\pi]$,
the Fejér-Riesz theorem provides a factorization of $f(\cos(t))$ such that $f(\cos(t)) = |h(e^{it})|^2$
(see for instance [15]). This proves that $X$ is $MA_p$ if, and only if, its covariance operator
may be written $f(W^{(\mathbb{Z})})$, where $f$ is a polynomial of degree $p$.

This remark is fundamental for the construction we provide in the following section (see
Definition 2.1).
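The correspondence between a moving average of order $p$ on $\mathbb{Z}$ and a polynomial spectral density of degree $p$ in the variable $\lambda = \cos(t)$ can be illustrated numerically. The sketch below is only an illustration: the coefficients `a` are hypothetical, the white noise is assumed to have unit variance, and the autocovariances are computed directly from the coefficients.

```python
import numpy as np
from numpy.polynomial import chebyshev

a = np.array([1.0, 0.6, -0.3])     # hypothetical MA_2 coefficients, with a_0 = 1
p = len(a) - 1

# Autocovariances r_k = sum_j a_j a_{j+k} for k = 0, ..., p (unit noise variance).
r = np.array([np.dot(a[: len(a) - k], a[k:]) for k in range(p + 1)])

# f = r_0 T_0 + 2 sum_{k >= 1} r_k T_k, expressed in the Chebyshev basis.
cheb_coeffs = np.concatenate(([r[0]], 2.0 * r[1:]))
power_coeffs = chebyshev.cheb2poly(cheb_coeffs)   # same f in the basis 1, x, x^2, ...

t = np.linspace(0.0, 2.0 * np.pi, 200)
f_of_cos = chebyshev.chebval(np.cos(t), cheb_coeffs)
h = np.polyval(a[::-1], np.exp(1j * t))           # h(e^{it}) = sum_k a_k e^{ikt}

print(np.max(np.abs(f_of_cos - np.abs(h) ** 2)))  # difference at rounding level
print(power_coeffs)                               # p + 1 coefficients: degree p
```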

1.4 Whittle maximum likelihood estimation for time series 

Here, we briefly recall Whittle's approximation for time series. Let $\Theta$ be a compact
interval of $\mathbb{R}^d$, $d \geq 1$, and $(f_\theta)_{\theta \in \Theta}$ be a parametric family of spectral densities. Let $\theta_0 \in \Theta$,
and assume that $(X_i)_{i \in \mathbb{Z}}$ is a Gaussian time series with spectral density $f_{\theta_0}$.

If we observe $X_n := (X_i)_{i=1,\dots,n}$, $n > 0$, we can define the maximum likelihood estimate
$\hat\theta_n$ of $\theta_0$ as:

\[ \hat\theta_n := \arg\max_{\theta \in \Theta} L_n(\theta, X_n), \]

where

\[ L_n(\theta, X_n) := -\frac{1}{2} \left( n \log(2\pi) + \log\det\left(T_n(f_\theta)\right) + X_n^T \left(T_n(f_\theta)\right)^{-1} X_n \right). \]

This estimator is consistent as soon as the spectral densities are regular enough, and under
assumptions on the function $\theta \mapsto f_\theta$ (see for instance [2]). However, in practical situations,
it is hard to compute. Whittle's estimate is built by maximizing an approximation of
the likelihood instead of the likelihood itself:

\[ \tilde\theta_n := \arg\max_{\theta \in \Theta} \tilde L_n(\theta, X_n), \]
where

\[ \tilde L_n(\theta, X_n) := -\frac{1}{2} \left( n \log(2\pi) + \frac{n}{2\pi} \int_{[0,2\pi]} \log\left(f_\theta(\lambda)\right) d\lambda + X_n^T \, T_n\!\left(\frac{1}{f_\theta}\right) X_n \right). \]

The Whittle estimate is also consistent and asymptotically normal and efficient, as soon
as the spectral densities are regular enough.

The consistency of the Whittle estimate relies on Szegő's Lemma, which provides
a bound on the error between $\frac{1}{n}\log\det\left(T_n(f_\theta)\right)$ and $\frac{1}{2\pi}\int_{[0,2\pi]} \log\left(f_\theta(\lambda)\right) d\lambda$. There exist many
versions of this Lemma (see for instance [2], [12]).
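The approximation of the normalized log-determinant by an integral can be observed on a simple case. The sketch below uses a hypothetical first order moving average with coefficient `a = 0.5` and unit noise variance, for which the determinant of the $n \times n$ Toeplitz matrix is known in closed form and the integral of the log spectral density vanishes. The helper `toeplitz_covariance` is an illustrative name.

```python
import numpy as np

def toeplitz_covariance(r, n):
    """n x n matrix with entries r[|i - j|], zero beyond the given lags."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    padded = np.zeros(n)
    m = min(n, len(r))
    padded[:m] = r[:m]
    return padded[lags]

a = 0.5                                   # hypothetical MA_1 coefficient
r = np.array([1.0 + a**2, a])             # autocovariances r_0, r_1
n = 200

sign, logdet = np.linalg.slogdet(toeplitz_covariance(r, n))

# (1 / 2 pi) * integral of log g over [0, 2 pi], with g(t) = r_0 + 2 r_1 cos(t).
t = 2.0 * np.pi * np.arange(4096) / 4096
szego_limit = float(np.mean(np.log(r[0] + 2.0 * r[1] * np.cos(t))))

gap = abs(logdet / n - szego_limit)
print(logdet / n, szego_limit, gap)
```

In this example the gap decreases like $1/n$, which is the order of the edge effect.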

In this work, we are interested in a weak version given by Azencott and Dacunha-Castelle
in [2]. The lemma relies on the following fundamental inequality: let $f(x) = \sum_{k \in \mathbb{N}} f_k x^k$ and
$g(x) = \sum_{k \in \mathbb{N}} g_k x^k$ be two analytic functions on the complex unit disk. Then we have

\[ \frac{1}{N} \left| \mathrm{Tr}\left( T_N(f)T_N(g) - T_N(fg) \right) \right| \leq \frac{1}{N} \sum_{k \in \mathbb{N}} (k+1)|f_k| \sum_{k \in \mathbb{N}} (k+1)|g_k|. \tag{3} \]

In the following, we aim at developing the same kind of tools for processes indexed by a
graph.

2 Spectral definition of ARMA processes

In this section, we will define moving average and autoregressive processes over the graph 
G. 

As explained in the last section, since $W$ is bounded and self-adjoint, $\mathrm{Sp}(W)$ is a non-empty
compact subset of $\mathbb{R}$, and $W$ admits a spectral decomposition thanks to an identity
resolution $E$, given by

\[ W = \int_{\mathrm{Sp}(W)} \lambda \, dE(\lambda). \]

We define here MA and AR Gaussian processes, with respect to the operator W, by 
defining the corresponding classes of covariance operators, since the covariance operator 
fully characterizes any Gaussian process. 

Definition 2.1. Let $(X_i)_{i \in G}$ be a Gaussian process, indexed by the vertices $G$ of the graph
$\mathbf{G}$, and $\Gamma$ its covariance operator.

If there exists an analytic function $f$ defined on the convex hull of $\mathrm{Sp}(W)$, such that

\[ \Gamma = \int_{\mathrm{Sp}(W)} f(\lambda) \, dE(\lambda), \]

we will say that $X$ is

• $MA_q$ if $f$ is a polynomial of degree $q$.

• $AR_p$ if $\frac{1}{f}$ is a polynomial of degree $p$ which has no root in the convex hull of $\mathrm{Sp}(W)$.

• $ARMA_{p,q}$ if $f = \frac{P}{Q}$ with $P$ a polynomial of degree $p$ and $Q$ a polynomial of degree $q$
with no roots in the convex hull of $\mathrm{Sp}(W)$.
Otherwise, we will talk about the $MA_\infty$ representation of the process $X$. We call $f$ the
spectral density of the process $X$, and denote its corresponding covariance operator by

\[ \Gamma = \mathcal{K}(f). \]

Remark. Actually, this last construction may also be understood as

\[ \Gamma = \mathcal{K}(f) = f(W), \]

in the sense of normal convergence of the associated power series. However, the spectral
representation will be useful in the following. Even if we consider only regular processes in
this work, the definition using the spectral representation allows weaker regularity than the
definition using the normal convergence of the associated power series.

This kind of modeling is interesting when the interactions are locally propagated (this
may be, for instance, a good model for traffic problems).
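On a finite graph, a covariance operator of the form $f(W)$ can be built from an eigendecomposition of $W$. The sketch below constructs one polynomial (moving average type) and one inverse-polynomial (autoregressive type) covariance and draws a Gaussian sample. The cycle on 10 vertices, the two densities and the function name `covariance_from_density` are illustrative assumptions, not the setting studied in the paper.

```python
import numpy as np

def covariance_from_density(W, f):
    """f(W) for a symmetric matrix W, computed through its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(W)
    return (eigenvectors * f(eigenvalues)) @ eigenvectors.T

# Hypothetical finite graph: a cycle on 10 vertices, renormalized by its degree 2.
n = 10
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 0.5
identity = np.eye(n)

# Polynomial density of degree 1 (moving average type), positive on [-1, 1].
K_ma = covariance_from_density(W, lambda lam: 1.0 + 0.5 * lam)

# Density whose inverse is a polynomial of degree 1 with root 2, outside [-1, 1].
K_ar = covariance_from_density(W, lambda lam: 1.0 / (1.0 - 0.5 * lam))

# One centered Gaussian sample with covariance K_ar.
rng = np.random.default_rng(0)
sample = np.linalg.cholesky(K_ar) @ rng.standard_normal(n)
print(sample)
```

In the autoregressive case the inverse covariance is sparse (here `identity - 0.5 * W`), which reflects the local propagation of the interactions.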

The notation $\mathcal{K}(\cdot)$ has to be understood by analogy with the notation $T(\cdot)$ used for
Toeplitz operators.

Notice that, in the usual case of $\mathbb{Z}$, and for finite order ARMA, we recover the usual
definition as shown in Subsection 1.3. So, the last definition may be seen as an extension
of isotropic ARMA to any graph $\mathbf{G}$. Besides, note that this extension is given by the
equivalence, for any g ELi^ ([0, 2tt]), such that Jj^ log(g') < +00, 

V/ G 1]), {g = f (cos(t)) ^ T{g) = /C(/)) . 

This means that, in the usual case $G = \mathbb{Z}$, the definition of spectral density in our framework
is the usual one, up to a change of variable $\lambda = \cos(t)$ (see Subsection 1.3).

Now, we get a representation of moving average processes over any graph $\mathbf{G}$. The following
section gives the main result of this paper. It deals with the maximum likelihood
identification.

3 Convergence of maximum approximated likelihood 
estimators 

In this section as before, $\mathbf{G} = (G, W)$ is a graph with bounded degree. Let also $(X_i)_{i \in G}$ be
a Gaussian spatial process indexed by the vertices of $\mathbf{G}$ with spectral density $f_{\theta_0}$ (defined in
Section 2) depending on an unknown parameter $\theta_0 \in \Theta$. We aim at estimating $\theta_0$. For this,
we will generalize classical maximum likelihood estimation of time series.

We will also develop a Whittle approximation for ARMA processes indexed by the
vertices of a graph. We follow here the guidelines of the proof given in [2] for the usual case
of time series.
3.1 Framework and Assumptions

Let us now specify the framework of our study. Let $(\mathbf{G}_n)_{n \in \mathbb{N}}$ be a growing sequence of finite
nested subgraphs. This means that if $\mathbf{G}_n = (G_n, W_n)$, we have $G_n \subset G_{n+1} \subset G$ and that for
any $i,j \in G_n$, it holds that $W_n(i,j) = W(i,j)$.
Let $m_n = \mathrm{Card}(G_n)$. We set also

\[ \delta_n = \mathrm{Card}\left\{ i \in G_n,\ \exists j \in G \setminus G_n,\ W_{ij} \neq 0 \right\}. \]

The sequence $(m_n)_{n \in \mathbb{N}}$ may actually be seen as the "volume" of the graph $\mathbf{G}_n$, and $\delta_n$
as the size of the boundary of $\mathbf{G}_n$. For the special case $G = \mathbb{Z}^d$ and $G_n = [-n, n]^d$, $n \geq 1$, we get
$m_n = (2n+1)^d$ and $\delta_n = (2n+1)^d - (2n-1)^d \leq 2d(2n+1)^{d-1}$.

The ratio $\frac{\delta_n}{m_n}$ is a natural quantity associated to the expansion of the graph that also
appears in isoperimetric [18] and graph expander issues. We will assume here that this
ratio goes to $0$ when the size of the graph goes to infinity. In short, we set

Assumption 3.1. $\delta_n = o(m_n)$.

This assumption is a non-expansion criterion. The graph has to be amenable, which is
satisfied for the last example $G = \mathbb{Z}^d$ and $G_n = [-n,n]^d$, but not for a homogeneous tree,
whatever the choice of the sequence of subgraphs $(\mathbf{G}_n)_{n \in \mathbb{N}}$ is.
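The non-expansion criterion can be checked by brute force on boxes of the lattice. The sketch below counts, for the box of half-width $n$ in dimension $d$ with nearest-neighbour edges, the vertices having at least one neighbour outside the box, and compares the count with the bound $2d(2n+1)^{d-1}$. The function name is an illustrative assumption, and the enumeration is only practical for small $n$ and $d$.

```python
from itertools import product

def volume_and_boundary(n, d):
    """Volume m_n and boundary size delta_n of the box [-n, n]^d in Z^d (n >= 1).

    A vertex of the box has a nearest neighbour outside the box exactly when
    one of its coordinates equals -n or n.
    """
    box = range(-n, n + 1)
    m_n = (2 * n + 1) ** d
    delta_n = sum(1 for v in product(box, repeat=d) if max(abs(c) for c in v) == n)
    return m_n, delta_n

for n in (2, 5, 20):
    m_n, delta_n = volume_and_boundary(n, 2)
    print(n, m_n, delta_n, delta_n / m_n)   # the ratio decreases towards 0
```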

We will now choose a parametric family of covariance operators of MA processes as
defined in the last section. First, let $\Theta$ be a compact interval of $\mathbb{R}$.

We point out that, for the sake of simplicity, we choose a one-dimensional parameter space
$\Theta$. Nevertheless, all the results could be easily extended to the case $\Theta \subset \mathbb{R}^k$, $k \geq 1$.

Define $\mathcal{F}$ as the set of positive analytic functions over the convex hull of $\mathrm{Sp}(W)$.

Let also $(f_\theta)_{\theta \in \Theta}$ be a parametric family of functions of $\mathcal{F}$. They define a parametric set
of covariances on $G$ (see Section 2) by

\[ \mathcal{K}(f_\theta) = f_\theta(W). \]

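As in [2], we will need a strong regularity for this family of spectral densities.
Let us introduce a regularity factor for any analytic function

\[ f \in \mathcal{F}, \quad f(x) = \sum_{k} f_k x^k \quad (x \in \mathrm{Sp}(W)), \]

by setting

\[ \alpha(f) := \sum_{k \in \mathbb{N}} |f_k|(k+1). \tag{4} \]

Now, let $\rho > 0$ and define

\[ \mathcal{F}_\rho := \left\{ f \in \mathcal{F},\ \alpha(\log(f)) \leq \rho \right\}. \tag{5} \]

Notice that for any $f \in \mathcal{F}_\rho$, we have $\alpha(f) \leq e^\rho$ and $\alpha\left(\frac{1}{f}\right) \leq e^\rho$
(indeed, $\alpha(fg) \leq \alpha(f)\alpha(g)$, so that $\alpha(e^h) \leq e^{\alpha(h)}$).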
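The regularity factor is a weighted sum of absolute values of power series coefficients, and it is sub-multiplicative, which is what makes the exponential bounds above work. The sketch below evaluates it on two hypothetical polynomials given by their coefficient lists (lowest degree first) and checks the product inequality. The function name `alpha` and the coefficients are illustrative.

```python
import numpy as np

def alpha(coeffs):
    """Regularity factor: sum over k of |f_k| (k + 1), for coefficients f_0, f_1, ..."""
    coeffs = np.asarray(coeffs, dtype=float)
    return float(np.sum(np.abs(coeffs) * (np.arange(len(coeffs)) + 1)))

f = [1.0, 0.5, -0.25]      # hypothetical f(x) = 1 + 0.5 x - 0.25 x^2
g = [2.0, -1.0]            # hypothetical g(x) = 2 - x
fg = np.convolve(f, g)     # coefficients of the product f g

print(alpha(f), alpha(g), alpha(fg))
```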
We need the following assumption 

Assumption 3.2.

• The map $\theta \mapsto f_\theta$ is injective.

• For any $\lambda \in \mathrm{Sp}(W)$, the map $\theta \mapsto f_\theta(\lambda)$ is continuous.

• $\forall \theta \in \Theta$, $f_\theta \in \mathcal{F}_\rho$.

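From now on, consider $\theta_0 \in \Theta$. Let $X$ be a centered Gaussian $MA_\infty$ process over $\mathbf{G}$ with
covariance operator $\mathcal{K}(f_{\theta_0})$ (see Section 2).

We observe the restriction of this process to the subgraph $\mathbf{G}_n$ defined before. Our aim
is to compute the maximum likelihood estimator of $\theta_0$. Let $X_n = (X_i)_{i \in G_n}$ be the observed
process and $\mathcal{K}_n(f_{\theta_0})$ be its covariance:

\[ X_n \sim \mathcal{N}\left(0, \mathcal{K}_n(f_{\theta_0})\right). \]

The corresponding log-likelihood at $\theta$ is

\[ L_n(\theta) := -\frac{1}{2} \left( m_n \log(2\pi) + \log\det\left(\mathcal{K}_n(f_\theta)\right) + X_n^T \left(\mathcal{K}_n(f_\theta)\right)^{-1} X_n \right). \]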

As discussed before, in the case $G = \mathbb{Z}$, it is usual to maximize an approximation of the
likelihood. The classical approximation is Whittle's one ([12]), where

\[ \frac{1}{n} \log\det\left(T_n(g)\right) \]

is replaced by

\[ \frac{1}{2\pi} \int_{[0,2\pi]} \log\left(g(t)\right) dt. \]

Back to the general case, we aim at performing the same kind of approximation. For
this, we will need the following assumption to ensure the convergence of the normalized
log-determinant $\frac{1}{m_n}\log\det\left(\mathcal{K}_n(f_\theta)\right)$ (see Section 1 for the definition of $\mu_{ii}$):

Assumption 3.3. There exists a positive measure $\mu$ such that

\[ \frac{1}{m_n} \sum_{i \in G_n} \mu_{ii} \xrightarrow[n \to \infty]{\mathcal{D}} \mu. \]

Here, $\mathcal{D}$ stands for the convergence in distribution.

The limit measure $\mu$ is classically called the spectral measure of $\mathbf{G}$ with respect to the
sequence of subgraphs $(\mathbf{G}_n)_{n \in \mathbb{N}}$ (see [17] for example).

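Actually, under Assumption 3.1, Assumption 3.3 is equivalent to the convergence of the
empirical distribution of eigenvalues of $W_{G_n}$ (here, $W_{G_n}$ denotes the restriction of $W$ to the
subgraph $\mathbf{G}_n$). That is, if $\lambda_1^{(n)}, \dots, \lambda_{m_n}^{(n)}$ denote the eigenvalues (written with their multiplicity
orders) of $W_{G_n}$, define

\[ \mu_n^{(1)} := \frac{1}{m_n} \sum_{i \in G_n} \mu_{ii} \quad \text{and} \quad \mu_n^{(2)} := \frac{1}{m_n} \sum_{i=1}^{m_n} \delta_{\lambda_i^{(n)}}. \]

Then, under Assumption 3.1, the convergence of $\mu_n^{(1)}$ to $\mu$ (i.e. Assumption 3.3) is equivalent
to the convergence of $\mu_n^{(2)}$ to $\mu$.

To prove this equivalence we just have to notice that, for any $k \in \mathbb{N}$,

\[ \int_{\mathrm{Sp}(W)} \lambda^k \, d\mu_n^{(1)}(\lambda) - \int_{\mathrm{Sp}(W)} \lambda^k \, d\mu_n^{(2)}(\lambda) = \frac{1}{m_n} \mathrm{Tr}\left( (W^k)_{G_n} \right) - \frac{1}{m_n} \mathrm{Tr}\left( (W_{G_n})^k \right), \]

so that we get the result by Lemma 6.1 (see Section 6.1).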
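The two normalized traces compared in this argument differ only through walks that leave the subgraph, so their difference is an edge effect. The sketch below measures it for $\mathbb{Z}$ with segments of $n$ consecutive vertices. The infinite operator is replaced by a longer segment with a margin of $k$ vertices on each side, which is exact for walks of length $k$ starting in the window. All names are illustrative assumptions.

```python
import numpy as np

def path_operator(size):
    """Renormalized adjacency matrix (weights 1/2) of a segment of Z."""
    W = np.zeros((size, size))
    idx = np.arange(size - 1)
    W[idx, idx + 1] = W[idx + 1, idx] = 0.5
    return W

def trace_gap(n, k):
    """|Tr((W^k) restricted to G_n) - Tr((W restricted to G_n)^k)| / n for G = Z."""
    margin = k   # walks of length k cannot leave a window enlarged by k vertices
    big_power = np.linalg.matrix_power(path_operator(n + 2 * margin), k)
    restricted_power = big_power[margin:margin + n, margin:margin + n]
    power_of_restricted = np.linalg.matrix_power(path_operator(n), k)
    return abs(np.trace(restricted_power) - np.trace(power_of_restricted)) / n

for n in (50, 200, 800):
    print(n, trace_gap(n, 4))   # decreases like 1 / n
```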

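As in the case of time series (for $G = \mathbb{Z}$), we can approximate the log-likelihood. The
approximations below avoid the computation of a determinant and, for the second one, the
inversion of a matrix. Indeed, we will consider the two following approximations:

\[ \bar L_n(\theta) := -\frac{1}{2} \left( m_n \log(2\pi) + m_n \int \log\left(f_\theta(x)\right) d\mu(x) + X_n^T \left(\mathcal{K}_n(f_\theta)\right)^{-1} X_n \right), \]

\[ \tilde L_n(\theta) := -\frac{1}{2} \left( m_n \log(2\pi) + m_n \int \log\left(f_\theta(x)\right) d\mu(x) + X_n^T \, \mathcal{K}_n\!\left(\frac{1}{f_\theta}\right) X_n \right). \]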
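To make the three criteria concrete, the sketch below evaluates the exact Gaussian log-likelihood and two approximations of the kind described above in the simplest setting $G = \mathbb{Z}$, with $n$ consecutive observed vertices. Several choices are assumptions made for illustration: the density `1 + 0.4 * lam`, the spectral measure of $\mathbb{Z}$ obtained by the change of variable $\lambda = \cos(t)$, and the computation of restricted operators on a longer segment (`margin`) as a stand-in for the infinite graph.

```python
import numpy as np

def restricted_operator(f, n, margin=60):
    """Approximate restriction to n consecutive vertices of f(W) on Z.

    f(W) is computed on a longer segment so that truncation effects stay far
    from the observation window.
    """
    size = n + 2 * margin
    W = np.zeros((size, size))
    idx = np.arange(size - 1)
    W[idx, idx + 1] = W[idx + 1, idx] = 0.5
    lam, vec = np.linalg.eigh(W)
    full = (vec * f(lam)) @ vec.T
    return full[margin:margin + n, margin:margin + n]

def log_likelihoods(x, f, grid=4096):
    """Exact log-likelihood and two Whittle-type approximations."""
    n = len(x)
    K = restricted_operator(f, n)
    K_inverse_density = restricted_operator(lambda lam: 1.0 / f(lam), n)
    t = 2.0 * np.pi * np.arange(grid) / grid
    log_integral = np.mean(np.log(f(np.cos(t))))   # integral of log f against mu
    _, logdet = np.linalg.slogdet(K)
    quad_exact = x @ np.linalg.solve(K, x)
    const = n * np.log(2.0 * np.pi)
    exact = -0.5 * (const + logdet + quad_exact)
    bar = -0.5 * (const + n * log_integral + quad_exact)
    tilde = -0.5 * (const + n * log_integral + x @ K_inverse_density @ x)
    return exact, bar, tilde

density = lambda lam: 1.0 + 0.4 * lam     # hypothetical density, positive on [-1, 1]
n = 100
rng = np.random.default_rng(0)
x = np.linalg.cholesky(restricted_operator(density, n)) @ rng.standard_normal(n)
exact, bar, tilde = log_likelihoods(x, density)
print(exact, bar, tilde)
```

The three values differ only by boundary terms, which become negligible after division by $n$.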

Notice that approximated maximum likelihood estimators are not asymptotically normal
in general (see for instance [13] for $\mathbb{Z}^d$). Indeed, the score associated to the approximated
log-likelihood has to be asymptotically unbiased [2].

To overcome this problem in $\mathbb{Z}^d$, the tapered periodogram can be used (see [11], [13], [6]).

Let us consider graph extensions of standard time series models:

• The $MA_P$ case: There exists $P > 0$ such that the true spectral density $f_{\theta_0}$ is a polynomial
of degree bounded by $P$.

• The $AR_P$ case: There exists $P > 0$ such that all the spectral densities (for any $\theta \in \Theta$)
of the parametric set are such that $\frac{1}{f_\theta}$ is a polynomial of degree bounded by $P$.

So, to define a suitable approximated log-likelihood, we first introduce the unbiased
periodogram in each of the last cases. Now, let $P > 0$.
Define a subset $\mathcal{V}_P$ of signed measures on $\mathbb{R}$ as

\[ \mathcal{V}_P := \left\{ \mu_{ij},\ i,j \in G,\ d_{\mathbf{G}}(i,j) \leq P \right\}, \]

where $d_{\mathbf{G}}(i,j)$, $i,j \in G$, stands for the usual distance on the graph $\mathbf{G}$, i.e. the length of the
shortest path going from $i$ to $j$.

We will need the following assumption.
We will need the following assumption 
Assumption 3.4. The set $\mathcal{V}_P$ of possible local measures over $\mathbf{G}$ is finite, and $n$ is large
enough to ensure that

\[ \forall v \in \mathcal{V}_P,\ \exists (i,j) \in G_n^2,\ \mu_{ij} = v. \]

Remark. This assumption is quite strong, and holds for instance for quasi-transitive graphs
(i.e. such that the quotient of the graph by its automorphism group is finite). This
assumption may be relaxed, but this requires hard and technical work that will be the subject of a
forthcoming paper.

Define now the matrix $B^{(n)}$ (the dependency on $P$ is omitted, for clarity) by

\[ B^{(n)}_{ij} := \begin{cases} \dfrac{\mathrm{Card}\left\{ (k,l) \in G_n \times G,\ \mu_{kl} = \mu_{ij} \right\}}{\mathrm{Card}\left\{ (k,l) \in G_n \times G_n,\ \mu_{kl} = \mu_{ij} \right\}} & \text{if } d_{\mathbf{G}}(i,j) \leq P, \\ 1 & \text{if } d_{\mathbf{G}}(i,j) > P. \end{cases} \]

The matrix $B^{(n)}$ gives a boundary correction, comparing, for any $v \in \mathcal{V}_P$, the frequency
of the interior couples of vertices with local measure $v$ with the boundary couples of vertices
with local measure $v$. Actually, this way to deal with the edge effect is very similar to the
one used for $G = \mathbb{Z}^d$ (see [6], [13]).

As an example, let us now describe the case $G = \mathbb{Z}^2$, for $P = 2$. In this case, $W^{(\mathbb{Z}^2)}$ is

\[ \forall i,j,k,l \in \mathbb{Z}, \quad W^{(\mathbb{Z}^2)}\left((i,j),(k,l)\right) := \frac{1}{4}\,\mathbf{1}_{\{|i-k|+|j-l|=1\}}. \]

In this example, we set $G_n = [1,n]^2$, and we can compute the matrix $B^{(n)}$. Indeed, we
only need to notice that

\[ \mu_{(i_1,j_1),(i_1+k,\,j_1+l)} = \mu_{(i_2,j_2),(i_2+\epsilon_1 k,\, j_2+\epsilon_2 l)}, \quad i_1,i_2,j_1,j_2,k,l \in \mathbb{Z},\ \epsilon_1,\epsilon_2 \in \{-1,1\}. \]

This means that the local measure of a couple of vertices depends only on their relative
positions (stationarity and isotropy of this set of measures). So, we need to count the
configurations given by Figure 1, since we consider only couples of vertices $u,v \in \mathbb{Z}^2$ such that
$d_{\mathbf{G}}(u,v) \leq 2$.

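We get, for any $i,j \in \mathbb{Z}$,

\[ B^{(n)}_{(i,j),(i,j)} = \frac{n^2}{n^2} = 1, \]

\[ B^{(n)}_{(i,j),(i\pm1,j)} = B^{(n)}_{(i,j),(i,j\pm1)} = \frac{4n^2}{4n(n-1)}, \]

\[ B^{(n)}_{(i,j),(i\pm1,j\pm1)} = \frac{4n^2}{4(n-1)^2}, \]

\[ B^{(n)}_{(i,j),(i,j\pm2)} = B^{(n)}_{(i,j),(i\pm2,j)} = \frac{4n^2}{4n(n-2)}. \]

One can notice that

\[ \sup_{i,j \in G_n} \left| B^{(n)}_{ij} - 1 \right| \xrightarrow[n \to \infty]{} 0. \]

Assumption 3.5 ensures that this property holds for the graph we consider.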
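The cardinalities entering the boundary correction for the square $[1,n]^2$ can be obtained by direct enumeration. The sketch below counts, for each class of relative positions at graph distance at most 2, the ordered couples whose two vertices lie in the square and the ordered couples whose first vertex only is required to lie in the square. The grouping of offsets into classes and the function name `pair_counts` are illustrative assumptions.

```python
from itertools import product

def pair_counts(n, offsets):
    """Count ordered pairs (u, v) with u in [1, n]^2 and v - u in `offsets`.

    Returns (inside, total): pairs with v also in the square, and pairs with
    v anywhere in Z^2.
    """
    box = range(1, n + 1)
    inside = sum(
        1
        for (i, j) in product(box, repeat=2)
        for (di, dj) in offsets
        if 1 <= i + di <= n and 1 <= j + dj <= n
    )
    total = n * n * len(offsets)
    return inside, total

n = 5
classes = {
    "nearest": [(1, 0), (-1, 0), (0, 1), (0, -1)],
    "diagonal": [(1, 1), (1, -1), (-1, 1), (-1, -1)],
    "two_steps": [(2, 0), (-2, 0), (0, 2), (0, -2)],
}
counts = {name: pair_counts(n, offsets) for name, offsets in classes.items()}
print(counts)
```

For each class, the ratio of the two counts tends to 1 as $n$ grows, which is the property required of the correction.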
Figure 1: Possible configurations for couples of vertices



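Back to the general case, let $f \in \mathcal{F}_\rho$. We define the unbiased periodogram as

\[ X_n^T \, Q_n(f) \, X_n, \]

where

\[ Q_n(f) := B^{(n)} \odot \mathcal{K}_n(f). \]

Here, the operation $\odot$ denotes the Hadamard product for matrices, that is

\[ (A \odot B)_{ij} = A_{ij} B_{ij}. \]

Notice that this is actually a way to extend the so-called tapered periodogram (see for
instance [13]).

We now define the unbiased empirical log-likelihood, for any $\theta \in \Theta$,

\[ L_n^{(u)}(\theta) := -\frac{1}{2} \left( m_n \log(2\pi) + m_n \int \log\left(f_\theta(x)\right) d\mu(x) + X_n^T \, Q_n\!\left(\frac{1}{f_\theta}\right) X_n \right). \]

We denote by $\hat\theta_n$, $\bar\theta_n$, $\tilde\theta_n$, $\theta_n^{(u)}$ the maximum likelihood estimators associated to $L_n$, $\bar L_n$,
$\tilde L_n$, $L_n^{(u)}$, respectively.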

We will need the following assumption. 

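Assumption 3.5. There exists a positive sequence $(u_n)_{n \in \mathbb{N}}$ such that

\[ u_n \xrightarrow[n \to \infty]{} 0 \]

and

\[ \sup_{i,j \in G_n} \left| B^{(n)}_{ij} - 1 \right| \leq u_n. \]

Notice that the last assumption holds for example in the case $G = \mathbb{Z}^d$, $d \geq 1$.
To prove asymptotic normality and efficiency of the estimator $\theta_n^{(u)}$, we will also need the
following assumption.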
Assumption 3.6. Assume that

• There exists a positive sequence $(v_n)_{n \in \mathbb{N}}$ such that $v_n = o\left(\frac{1}{\sqrt{m_n}}\right)$ and

\[ \forall f \in \mathcal{F}_\rho, \quad \left| \frac{1}{m_n} \mathrm{Tr}\left(\mathcal{K}_{G_n}(f)\right) - \int f \, d\mu \right| \leq \alpha(f) v_n. \]

• For any $\theta \in \Theta$, $f_\theta$ is twice differentiable on $\Theta$ and

The first assumption means that the convergence of the empirical distribution of eigenvalues
of $\mathcal{K}_n(f)$ to the spectral measure is faster than $\frac{1}{\sqrt{m_n}}$. It holds for instance for
quasi-transitive graphs, with a suitable sequence of subgraphs. The second assumption is
more classical. For example, it is required in the case $G = \mathbb{Z}$ (see [2]).



3.2 Convergence and asymptotic optimality 

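Let $\rho > 0$. We can now state one of our main results:

Theorem 3.1. Under Assumptions 3.1, 3.2 and 3.3, the sequences $(\hat\theta_n)_{n \in \mathbb{N}}$, $(\bar\theta_n)_{n \in \mathbb{N}}$, $(\tilde\theta_n)_{n \in \mathbb{N}}$
converge, as $n$ goes to infinity, $P_{f_{\theta_0}}$-a.s. to the true value $\theta_0$. If moreover Assumption 3.5
holds, this is also true for $(\theta_n^{(u)})_{n \in \mathbb{N}}$.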

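Proof. The proof follows the guidelines of [2]. We highlight the main changes performed
here. First, we define the Kullback information on $G_n$ of $f_{\theta_0}$ with respect to $f \in \mathcal{F}_\rho$ by

\[ IK_n(f_{\theta_0}, f) := E_{f_{\theta_0}}\left[ -\log\left( \frac{dP_f}{dP_{f_{\theta_0}}}(X_n) \right) \right], \]

and the asymptotic Kullback information (on $G$) by

\[ IK(f_{\theta_0}, f) = \lim_{n \to \infty} \frac{1}{m_n} IK_n(f_{\theta_0}, f) \]

whenever it is finite.

The convergence of the estimators of the maximum approximated likelihood is a direct
consequence of the following lemmas: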

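Lemma 3.1. For any $f \in \mathcal{F}_\rho$, and under Assumptions 3.1, 3.2 and 3.3, the asymptotic
Kullback information exists and may be written as

\[ IK(f_{\theta_0}, f) = \frac{1}{2} \int \left( -\log\left(\frac{f_{\theta_0}}{f}\right) - 1 + \frac{f_{\theta_0}}{f} \right) d\mu. \]

Furthermore, if we set $l_n(\theta, X_n) = \frac{1}{m_n} L_n(\theta, X_n)$, we have that $P_{f_{\theta_0}}$-a.s.,

\[ l_n(\theta_0, X_n) - l_n(\theta, X_n) \xrightarrow[n \to \infty]{} IK(f_{\theta_0}, f_\theta) \]

uniformly in $\theta \in \Theta$.

This property also holds for $\bar l_n := \frac{1}{m_n} \bar L_n$ and $\tilde l_n := \frac{1}{m_n} \tilde L_n$.

Furthermore, for $P > 0$, and for both the $AR_P$ or the $MA_P$ case (see above), this also
holds for $l_n^{(u)} := \frac{1}{m_n} L_n^{(u)}$.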

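Lemma 3.2. Let $f_{\theta_0}$ be the true spectral density, and $(\ell_n)_{n \in \mathbb{N}}$ be a deterministic sequence of
continuous functions such that

\[ \forall \theta \in \Theta, \quad \ell_n(\theta_0) - \ell_n(\theta) \xrightarrow[n \to \infty]{} IK(f_{\theta_0}, f_\theta) \]

uniformly as $n$ tends to infinity. Then, if $\theta_n = \arg\max_\theta \ell_n(\theta)$, we have

\[ \theta_n \xrightarrow[n \to \infty]{} \theta_0. \]

The proofs of these lemmas are postponed to the Appendix (Subsection 6.2).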



□ 



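Theorem 3.2. In both the $AR_P$ and $MA_P$ cases, and under all the previous assumptions,
the estimator $\theta_n^{(u)}$ of $\theta_0$ is asymptotically normal:

\[ \sqrt{m_n}\left(\theta_n^{(u)} - \theta_0\right) \xrightarrow[n \to \infty]{\mathcal{D}} \mathcal{N}\left(0, \left( \frac{1}{2} \int \left( \frac{f'_{\theta_0}}{f_{\theta_0}} \right)^2 d\mu \right)^{-1} \right), \]

where $f'_\theta$ denotes the derivative of $f_\theta$ with respect to $\theta$.

Furthermore, the Fisher information of the model is

\[ \frac{1}{2} \int \left( \frac{f'_{\theta_0}}{f_{\theta_0}} \right)^2 d\mu. \]

Hence, the previous estimator is asymptotically efficient.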

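Proof. Here again, we mimic the usual proof by extending the result of [2] to the graph case.
Using a Taylor expansion, we get

\[ (l_n^{(u)})'(\theta_n^{(u)}) = (l_n^{(u)})'(\theta_0) + \left(\theta_n^{(u)} - \theta_0\right) (l_n^{(u)})''(\breve\theta_n), \]

where $\breve\theta_n \in [\theta_n^{(u)}, \theta_0]$. As $\theta_n^{(u)} = \arg\max l_n^{(u)}$, we have $(l_n^{(u)})'(\theta_n^{(u)}) = 0$, so that

\[ \theta_0 - \theta_n^{(u)} = \left( (l_n^{(u)})''(\breve\theta_n) \right)^{-1} (l_n^{(u)})'(\theta_0). \]

The end of the proof relies on three lemmas:

Lemma 3.3 provides the asymptotic normality for $\sqrt{m_n}\,(l_n^{(u)})'(\theta_0)$. Combined with Lemma
3.4, we get the asymptotic normality for $\sqrt{m_n}\,(\theta_0 - \theta_n^{(u)})$. Finally, Lemma 3.5 gives the Fisher
information.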



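Lemma 3.3.

\[ \sqrt{m_n}\,(l_n^{(u)})'(\theta_0) \xrightarrow[n \to \infty]{\mathcal{D}} \mathcal{N}\left(0, \frac{1}{2} \int \left( \frac{f'_{\theta_0}}{f_{\theta_0}} \right)^2 d\mu \right). \]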
Lemma 3.4.



) 



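Lemma 3.5. The asymptotic Fisher information is:

\[ \frac{1}{2} \int \left( \frac{f'_{\theta_0}}{f_{\theta_0}} \right)^2 d\mu, \]

where $f'_\theta$ denotes the derivative of $f_\theta$ with respect to $\theta$.

The proofs of these lemmas are postponed to the Appendix (Subsection 6.3).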



□ 
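The form of the Fisher information can be related to the asymptotic Kullback information by a second order expansion. The following is only a sketch, under the assumptions that the asymptotic Kullback information reads $IK(f_{\theta_0}, f_\theta) = \frac{1}{2}\int \phi\left(f_{\theta_0}/f_\theta\right) d\mu$ with $\phi(r) = -\log r - 1 + r$, and that differentiation under the integral sign is justified. The notation $\partial_\theta$ for the derivative with respect to $\theta$ is introduced here for this computation only.

```latex
\begin{align*}
\phi(r) &= -\log r - 1 + r, \qquad \phi(1) = 0,\quad \phi'(1) = 0,\quad \phi''(1) = 1, \\
r_\theta &= \frac{f_{\theta_0}}{f_\theta}, \qquad r_{\theta_0} = 1, \qquad
\partial_\theta r_\theta \big|_{\theta=\theta_0} = -\frac{\partial_\theta f_{\theta_0}}{f_{\theta_0}}, \\
\partial_\theta^2\, \phi(r_\theta) \big|_{\theta=\theta_0}
 &= \phi''(1) \left( \partial_\theta r_\theta \big|_{\theta_0} \right)^2
  + \phi'(1)\, \partial_\theta^2 r_\theta \big|_{\theta_0}
  = \left( \frac{\partial_\theta f_{\theta_0}}{f_{\theta_0}} \right)^2, \\
\partial_\theta^2\, IK(f_{\theta_0}, f_\theta) \big|_{\theta=\theta_0}
 &= \frac{1}{2} \int \left( \frac{\partial_\theta f_{\theta_0}}{f_{\theta_0}} \right)^2 d\mu .
\end{align*}
```

The curvature of the asymptotic Kullback information at the true parameter is thus the quantity that plays the role of the Fisher information per observation.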



4 Discussion 

Note first that Theorem 3.1 provides consistency of the estimators under weak conditions on
the graph. Indeed, amenability ensures Assumption 3.1 for a suitable sequence of subgraphs.
Assumption 3.3 holds as soon as there is a kind of homogeneity in the graph. The simplest
application is quasi-transitive graphs. Note that if $\mathbf{G}$ is "close" to being quasi-transitive,
Assumption 3.3 is still true. We could also adapt notions of unimodularity [1] or stationarity [3]
to our framework and prove the existence of a spectral measure. Furthermore, Assumption
3.3 holds for the real traffic network (this will be explained in a forthcoming paper).

To build the estimator $\theta_n^{(u)}$, stronger assumptions on the graph $\mathbf{G}$ are needed. Let us
discuss two very special cases. First, Theorem 3.2 may be applied in the case $\mathbb{Z}^d$ with holes,
that is in the presence of missing data, provided that they remain few enough.
Actually, Assumption 3.1 is required, so the boundary of the subgraphs (counting the holes)
has to be small compared with the volume of these subgraphs.

We need furthermore a kind of homogeneity for these holes. For instance, we can assume 
that the data are missing completely at random. This particular case is interesting for 
prediction issues. 

Another strong potential application is quasi-transitive graphs, as mentioned above. Indeed,
take for instance a finite graph (the pattern) and reproduce it at each vertex of an
infinite (amenable) vertex-transitive graph. The final graph is then quasi-transitive, and all
the previous assumptions hold.

This seems to be a natural extension of what happens for $\mathbb{Z}^d$. Furthermore, in this
situation as in $\mathbb{Z}^d$, our work may also be applied to a process with missing values.

Note also that conditions of both amenability of the graphs and regularity of spectral
densities seem natural, looking at Szegő's Lemmas (see Section 6.1). Indeed, the difference
computed in Lemma 6.1 is only due to edge effects.

Thus, there are two ways of relaxing these conditions. On the one hand, it could be
interesting to deal with lower regularity (for instance to study long memory processes) for
the spectral densities. On the other hand, it could also be interesting to relax conditions on
the graph, for instance for more regular densities. In particular, we could investigate the case
of random graphs, and try to pick up homogeneity conditions in the random structure. As
mentioned above, another natural extension of this work could be done to graphs "close" to
being quasi-transitive.

These two limits of our present work are actually two of our main perspectives in this
framework.

5 Simulations 

In this section, we give some simulations in a very simple case, where the graph G is built
from rhombi, each connected to the next by a single edge on both the left and the right (see Figure 2).



Figure 2: Graph G 




The sequence of nested subgraphs chosen here is the growing neighborhood sequence (we
choose a point x and we take $G_n = \{y \in G,\ d_G(x, y) \le n\}$). We study an AR2 model, where,

e = |-i,i|, 

Here, we take for W the adjacency operator of G normalized in order to get supj ^g^. Wij < 
^j^^^. We choose 6*0 = |, m„ = 724. We approximate the spectral measure of G by the 
spectral measure of a very large graph (around 10000 vertices) built in the same way. Figure
3 shows the empirical spectrum of the graph G with respect to the sequence of subgraphs
$(G_n)_{n \in \mathbb{N}}$.

To compute $(\mathcal{K}_n(f_\theta))^{-1}$, we use the power series representation of $f_\theta$, and truncate this
expression after the first 15 coefficients. This choice ensures that the simulation errors are
negligible with respect to the theoretical ones.
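The truncation step can be illustrated numerically. The sketch below is only an assumption-laden toy example: it uses a path graph instead of the rhombus graph, a hypothetical density $f_\theta(x) = 1/(1-\theta x)^2$ with power series coefficients $(k+1)\theta^k$, and a hypothetical value of $\theta$. It compares the operator obtained from the first 15 coefficients with the exact one on this finite graph.

```python
import numpy as np

def path_adjacency(m):
    """Adjacency operator of a path with m vertices, divided by the maximal degree 2."""
    W = np.zeros((m, m))
    idx = np.arange(m - 1)
    W[idx, idx + 1] = 0.5
    W[idx + 1, idx] = 0.5
    return W

def truncated_operator(W, coeffs):
    """Return sum_k coeffs[k] * W^k, accumulating successive powers of W."""
    K = np.zeros_like(W)
    Wk = np.eye(W.shape[0])
    for c in coeffs:
        K += c * Wk
        Wk = Wk @ W
    return K

theta = 0.25      # hypothetical parameter value, not taken from the paper
n_terms = 15      # number of coefficients kept, as in the text
# Power series of the assumed density 1 / (1 - theta * x)^2
coeffs = [(k + 1) * theta**k for k in range(n_terms)]

W = path_adjacency(60)
K_trunc = truncated_operator(W, coeffs)
K_exact = np.linalg.matrix_power(np.linalg.inv(np.eye(60) - theta * W), 2)
trunc_error = np.max(np.abs(K_trunc - K_exact))
print(trunc_error)
```

With these assumed values the truncation error is several orders of magnitude below the statistical fluctuations one expects for a few hundred vertices, which is the kind of behaviour the choice of 15 coefficients is meant to guarantee.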

Figure 4 gives the empirical distribution of

Figure 3: Empirical spectrum

Figure 4: Empirical distribution

6 Appendix



6.1 Szegö's Lemmas

Szegö's Lemmas [12] are useful in time series analysis. Indeed, they provide good
approximations for the likelihood. As explained in Section 3, these approximations of the likelihood
are easier to compute.

In this section, we generalize a weak version of the Szegö Lemmas to a general graph,
under Assumption 3.1 (non-expansion criterion for $G_n$) and Assumption 3.3 (existence of
the spectral measure $\mu$).

For any matrix $(B_{ij})_{i,j \in G_n}$, we define the block norm

$$ b_n(B) = \frac{1}{\delta_n} \sum_{i,j \in G_n} |B_{ij}|. $$

We can state the equivalent version of the first Szegö lemma for time series.

Lemma 6.1. Asymptotic homomorphism

Let k, n be positive integers, and let $g_1, \dots, g_k$ be analytic functions over $[-1, 1]$ having
finite regularity factors (i.e. $\alpha(g_i) < +\infty$, $i = 1, \dots, k$). Then,

$$ b_n\big(\mathcal{K}_n(g_1) \cdots \mathcal{K}_n(g_k) - \mathcal{K}_n(g_1 \cdots g_k)\big) \le \frac{k-1}{2}\, \alpha(g_1) \cdots \alpha(g_k). $$
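The edge-effect nature of this bound can be checked on a toy example. The sketch below assumes $G = \mathbb{Z}$, $W$ equal to the adjacency operator divided by 2, $G_n$ a block of m consecutive vertices, and two hypothetical polynomials f and g. It computes the unnormalized sum of $|(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg))_{ij}|$ and shows that it does not grow with m, so that it only reflects the two boundary points of the block.

```python
import numpy as np

def restricted_operator(coeffs, m):
    """K_n(f) for a polynomial f (coeffs in increasing degree) on Z, W = adjacency / 2.

    The operator f(W) is computed on a window padded by deg(f) vertices on each
    side, then restricted to the m central vertices, so that the central block
    coincides with the restriction of the operator defined on the whole of Z.
    """
    pad = len(coeffs) - 1
    size = m + 2 * pad
    W = np.zeros((size, size))
    idx = np.arange(size - 1)
    W[idx, idx + 1] = 0.5
    W[idx + 1, idx] = 0.5
    K = np.zeros((size, size))
    Wk = np.eye(size)
    for c in coeffs:
        K += c * Wk
        Wk = Wk @ W
    return K[pad:pad + m, pad:pad + m]

f = [1.0, 0.5, 0.25]      # hypothetical polynomial f(x) = 1 + x/2 + x^2/4
g = [2.0, -0.3]           # hypothetical polynomial g(x) = 2 - 0.3 x
fg = np.convolve(f, g)    # coefficients of the product f * g

def edge_defect(m):
    """Sum of |entries| of K_n(f) K_n(g) - K_n(fg) on a block of m vertices."""
    D = restricted_operator(f, m) @ restricted_operator(g, m) - restricted_operator(fg, m)
    return np.abs(D).sum()

defects = {m: edge_defect(m) for m in (20, 40, 80)}
print(defects)
```

In this example the defect is the same for all three block sizes, while the volume of the block grows: after division by the volume the difference vanishes, which is the mechanism used in Corollary 6.1.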

Corollary 6.1. For any $g \in \mathcal{F}_\rho$ (see the first page of Subsection 3.1 for the definition), and
under Assumptions 3.1 and 3.3,

$$ \frac{1}{m_n} \log\det(\mathcal{K}_n(g)) \xrightarrow[n \to \infty]{} \int \log(g)\, d\mu. $$



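Proof of Lemma 6.1. This proof again follows the one of [2]. We will prove the result by
induction on k.

First we deal with the case k = 2. Let f and g be analytic functions over $[-1, 1]$ such that
$\alpha(f) < +\infty$ and $\alpha(g) < +\infty$. We write

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big)
 = \frac{1}{\delta_n} \sum_{i,j \in G_n} \Big| \sum_{k \in G_n} \mathcal{K}_n(f)_{ik}\, \mathcal{K}_n(g)_{kj} - \sum_{k \in G} \mathcal{K}(f)_{ik}\, \mathcal{K}(g)_{kj} \Big|
 = \frac{1}{\delta_n} \sum_{i,j \in G_n} \Big| \sum_{k \in G \setminus G_n} \mathcal{K}(f)_{ik}\, \mathcal{K}(g)_{kj} \Big|. $$

Using $\mathcal{K}(g) = \sum_{h=0}^{\infty} g_h W^h$, Fubini's theorem gives, since all the previous sequences are
in $l^1(G)$,

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big)
 \le \frac{1}{\delta_n} \sum_{i,j \in G_n} \sum_{k \in G \setminus G_n} |\mathcal{K}(f)_{ik}|\, |\mathcal{K}(g)_{kj}|
 \le \sup_{k \in G} \sum_{i \in G} |\mathcal{K}(f)_{ik}| \sum_{h=0}^{\infty} |g_h| \frac{1}{\delta_n} \sum_{k \in G \setminus G_n} \sum_{j \in G_n} |(W^h)_{kj}|. $$

Introducing

$$ \Delta_h = \frac{1}{\delta_n} \sum_{k \in G \setminus G_n} \sum_{j \in G_n} |(W^h)_{kj}|, $$

we get

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big) \le \sup_{k \in G} \sum_{i \in G} |\mathcal{K}(f)_{ik}| \sum_{h=0}^{\infty} |g_h|\, \Delta_h. $$

The coefficient $\Delta_h$ is a porosity factor. It measures the weight of the paths of length h
going from the interior of $G_n$ to outside.
Note that $\Delta_h \le h + 1$, so we get

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big) \le \sup_{k \in G} \sum_{i \in G} |\mathcal{K}(f)_{ik}| \sum_{h=0}^{\infty} (h+1)\, |g_h|. $$

Now, we define another norm on $\mathcal{B}_G$:

$$ \|B\|_{\infty,in} := \sup_{k \in G} \sum_{i \in G} |B_{ik}|, \quad (B \in \mathcal{B}_G). $$

We thus obtain

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big) \le \|\mathcal{K}(f)\|_{\infty,in}\, \alpha(g). $$

Finally, we get

$$ \|\mathcal{K}(f)\|_{\infty,in} \le \sum_{h=0}^{\infty} |f_h|\, \|W^h\|_{\infty,in} \le \sum_{h=0}^{\infty} |f_h|\, \|W\|_{\infty,in}^h \le \sum_{h=0}^{\infty} |f_h| = \|f\|_{1,pol}, $$

so that

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big) \le \|f\|_{1,pol}\, \alpha(g). $$

To conclude the proof of the lemma, by symmetrization of the last inequality, and since
$1 \le (h + 1)$, we have,

$$ b_n\big(\mathcal{K}_n(f)\mathcal{K}_n(g) - \mathcal{K}_n(fg)\big) \le \frac{1}{2}\, \alpha(f)\alpha(g). \quad (6) $$

To perform the inductive step, we need the following inequalities [21]:

$$ \alpha(fg) \le \alpha(f)\alpha(g), $$
$$ b_n(BC) \le \|B\|_{\infty,in}\, b_n(C), $$
$$ b_n(B + C) \le b_n(B) + b_n(C), $$
$$ \|\mathcal{K}_n(f)\|_{\infty,in} \le \|f\|_{1,pol} \le \alpha(f). $$

Let k > 1, and assume that for all $j \le k - 1$, Lemma 6.1 holds. Under the previous
assumptions, and the inductive hypothesis for k - 1 we get,

$$ b_n\big(\mathcal{K}_n(g_1) \times \cdots \times \mathcal{K}_n(g_k) - \mathcal{K}_n(g_1 \cdots g_k)\big)
 \le b_n\Big(\mathcal{K}_n(g_1)\big[\mathcal{K}_n(g_2) \cdots \mathcal{K}_n(g_k) - \mathcal{K}_n(g_2 \cdots g_k)\big]\Big)
 + b_n\big(\mathcal{K}_n(g_1)\mathcal{K}_n(g_2 \cdots g_k) - \mathcal{K}_n(g_1 \cdots g_k)\big) $$
$$ \le \alpha(g_1)\, \frac{k-2}{2}\, \alpha(g_2) \cdots \alpha(g_k) + \frac{1}{2}\, \alpha(g_1)\alpha(g_2 \cdots g_k)
 \le \frac{k-1}{2}\, \alpha(g_1) \cdots \alpha(g_k), $$

which completes the induction step and proves the result. □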

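Proof of Corollary 6.1.

Let $g \in \mathcal{F}_\rho$, and let k be a positive integer. Using Lemma 6.1, we have

$$ \frac{1}{m_n} \Big| \mathrm{Tr}\big(\mathcal{K}_n(g)^k - \mathcal{K}_n(g^k)\big) \Big| \le \frac{\delta_n}{m_n}\, b_n\big(\mathcal{K}_n(g)^k - \mathcal{K}_n(g^k)\big). \quad (7) $$

Thus, we have, thanks to Assumption 3.1,

$$ \frac{1}{m_n} \mathrm{Tr}\big(\mathcal{K}_n(g)^k - \mathcal{K}_n(g^k)\big) \to 0. $$

Denote $\mu_g^{(1)}$ the real measure whose k-th moment is given by

$$ \int x^k\, d\mu_g^{(1)} = \lim_n \frac{1}{m_n} \mathrm{Tr}\big(\mathcal{K}_n(g)^k\big), $$

and $\mu_g^{(2)}$ the real measure whose k-th moment is given by

$$ \int x^k\, d\mu_g^{(2)} = \lim_n \frac{1}{m_n} \mathrm{Tr}\big(\mathcal{K}_n(g^k)\big). $$

Notice that both of these measures have support between $\inf g \ge e^{-\rho} > 0$ and
$\sup g \le e^{\rho} < +\infty$, since $\alpha(\log(g)) \le \rho$ (see Section 3). Therefore, the equality of the moments given
by Equation (7) gives the equality of the measures $\mu_g^{(1)}$ and $\mu_g^{(2)}$.
So that, we get

$$ \frac{1}{m_n} \log\big(\det(\mathcal{K}_n(g))\big) - \frac{1}{m_n} \mathrm{Tr}\big(\mathcal{K}_n(\log(g))\big) \xrightarrow[n \to +\infty]{} 0. $$

Assumption 3.3 completes the proof of the Corollary since it implies that

$$ \frac{1}{m_n} \mathrm{Tr}\big(\mathcal{K}_n(\log(g))\big) \to \int \log(g)\, d\mu. \quad (8) $$

□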
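The limit of Corollary 6.1 can be checked numerically in the simplest setting. The sketch below assumes $G = \mathbb{Z}$ with $W$ the adjacency operator divided by 2, for which the spectral measure is the law of $\cos(U)$ with U uniform on $[0, \pi]$, and takes the hypothetical function $g(x) = 2 + x$, so that $\mathcal{K}_n(g)$ is a tridiagonal matrix. The closed form used for the integral is the classical value $\log((2 + \sqrt{3})/2)$.

```python
import numpy as np

def mean_logdet(m):
    """(1/m) log det K_n(g) for g(x) = 2 + x on a block of m consecutive vertices of Z."""
    K = 2.0 * np.eye(m)
    idx = np.arange(m - 1)
    K[idx, idx + 1] = 0.5      # entries of W = adjacency / 2
    K[idx + 1, idx] = 0.5
    sign, logdet = np.linalg.slogdet(K)
    return logdet / m

# Integral of log(g) against the spectral measure, by the midpoint rule in t
t = (np.arange(200000) + 0.5) * np.pi / 200000
limit = np.mean(np.log(2.0 + np.cos(t)))
closed_form = np.log((2.0 + np.sqrt(3.0)) / 2.0)

gaps = [abs(mean_logdet(m) - limit) for m in (50, 200, 800)]
print(limit, closed_form, gaps)
```

The gap decreases roughly like the inverse of the block size, in line with a boundary of fixed size and a volume that grows linearly.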



The following lemma enables us to replace $\mathcal{K}_n(g)$ by the unbiased version $Q_n(g)$ (see Section
3 for the definition).

Lemma 6.2. Under Assumptions 3.1, 3.3, 3.4 and 3.5, and if f or g is a polynomial having
degree less than or equal to P, we have

$$ \frac{1}{m_n} \Big| \mathrm{Tr}\Big( \big(\mathcal{K}_n(f)\mathcal{K}_n(g)\big)^p - \big(\mathcal{K}_n(f)Q_n(g)\big)^p \Big) \Big| \le 2^p\, u_n\, \alpha(f)^p \alpha(g)^p. $$



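Proof. We define, for any f,

$$ f_{abs}(x) = \sum_k |f_k|\, x^k. $$

Actually, the proof is based on the following idea: as soon as f or g is a polynomial having
degree less than or equal to P, we have to control only the number of paths of length less
than or equal to P (counted with their weights).

Let p be a positive integer. Recalling that $Q_n(g) = B^{(n)} \odot \mathcal{K}_n(g)$ (entrywise product, see Section 3), we have

$$ \frac{1}{m_n} \Big| \mathrm{Tr}\Big( \big(\mathcal{K}_n(f)\mathcal{K}_n(\tfrac{1}{g})\big)^p - \big(\mathcal{K}_n(f)Q_n(\tfrac{1}{g})\big)^p \Big) \Big| $$
$$ = \frac{1}{m_n} \Big| \sum_{i \in G_n} \sum_{i_0 = i, i_1, \dots, i_{2p} = i} \Big( 1 - \prod_{l=0}^{p-1} B^{(n)}_{i_{2l+1} i_{2l+2}} \Big) \prod_{l=0}^{p-1} \mathcal{K}_n(f)_{i_{2l} i_{2l+1}}\, \mathcal{K}_n(\tfrac{1}{g})_{i_{2l+1} i_{2l+2}} \Big| $$
$$ \le \sup_{i_1, i_2, \dots, i_{2p}} \Big| \prod_{l=0}^{p-1} B^{(n)}_{i_{2l+1} i_{2l+2}} - 1 \Big| \; \frac{1}{m_n} \sum_{i \in G_n} \sum_{i_0 = i, i_1, \dots, i_{2p} = i} \prod_{l=0}^{p-1} \big|\mathcal{K}_n(f)_{i_{2l} i_{2l+1}}\big|\, \big|\mathcal{K}_n(\tfrac{1}{g})_{i_{2l+1} i_{2l+2}}\big| $$
$$ \le \sup_{i_1, i_2, \dots, i_{2p}} \Big| \prod_{l=0}^{p-1} B^{(n)}_{i_{2l+1} i_{2l+2}} - 1 \Big| \; \frac{1}{m_n} \mathrm{Tr}\Big( \big(\mathcal{K}_{G_n}(f_{abs})\, \mathcal{K}_{G_n}((\tfrac{1}{g})_{abs})\big)^p \Big) $$
$$ \le \sup_{i_1, i_2, \dots, i_{2p}} \Big| \prod_{l=0}^{p-1} B^{(n)}_{i_{2l+1} i_{2l+2}} - 1 \Big| \; \alpha(f)^p\, \alpha(\tfrac{1}{g})^p. $$

Using Assumption 3.5 we get,

$$ \frac{1}{m_n} \Big| \mathrm{Tr}\Big( \big(\mathcal{K}_n(f)\mathcal{K}_n(\tfrac{1}{g})\big)^p - \big(\mathcal{K}_n(f)Q_n(\tfrac{1}{g})\big)^p \Big) \Big|
 \le u_n \big( (1 + u_n)^{p-1} + (1 + u_n)^{p-2} + \cdots + 1 \big)\, \alpha(f)^p \alpha(\tfrac{1}{g})^p
 \le u_n (2^p - 1)\, \alpha(f)^p \alpha(\tfrac{1}{g})^p. $$

This ends the proof of the Lemma.

□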



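Finally, the following lemma explains the choice of $B^{(n)}$. The unbiased quadratic form
$Q_n$ is no more than a correction of the error between $\mathcal{K}_n(f)\mathcal{K}_n(g)$ and $\mathcal{K}_n(fg)$.

Lemma 6.3 (Exact correction). Let $f, g \in \mathcal{F}_\rho$, and assume that either f or g is a polynomial
of degree less than or equal to P (see Section 3). Then, the unbiased quadratic form $Q_n(g)$
verifies

$$ \mathrm{Tr}\big(\mathcal{K}_n(f)Q_n(g)\big) = \mathrm{Tr}\big(\mathcal{K}_n(fg)\big). $$

Proof of Lemma 6.3.
First, notice that

$$ \mathrm{Tr}\big(\mathcal{K}_n(f)Q_n(g)\big) = \sum_{i,j \in G_n} \mathcal{K}_n(f)_{ij}\, \mathcal{K}_n(g)_{ij}\, B^{(n)}_{ij}. $$

Since this expression is symmetric in f, g, we can now consider the case where f is a
polynomial of degree less than or equal to P.

Actually, since f is a polynomial, $\mathcal{K}_n(f)_{ij} = 0$ as soon as $d(i,j) > P$ ($i,j \in G$). Then, if
$i,j,k,l \in G$ are such that $\mu_{ij} = \mu_{kl}$, we have

$$ \mathcal{K}_n(f)_{ij}\, \mathcal{K}_n(g)_{ij} = \mathcal{K}_n(f)_{kl}\, \mathcal{K}_n(g)_{kl}. $$

So that, we may here denote, for convenience, $\mathcal{K}(f)_v\, \mathcal{K}(g)_v$ this common value when $\mu_{ij} = v$.

Using Assumption 3.4, this leads to

$$ \mathrm{Tr}\big(\mathcal{K}_n(f)Q_n(g)\big) = \sum_{i,j \in G_n} \mathcal{K}_n(f)_{ij}\, \mathcal{K}_n(g)_{ij}\, B^{(n)}_{ij} $$
$$ = \sum_{v \in V_P} \sum_{i,j \in G_n,\ \mu_{ij} = v} \mathcal{K}(f)_v\, \mathcal{K}(g)_v\, B^{(n)}_{ij} $$
$$ = \sum_{v \in V_P} \mathcal{K}(f)_v\, \mathcal{K}(g)_v\; \mathrm{Card}\{(i,j) \in G_n \times G_n,\ \mu_{ij} = v\}\; \frac{\mathrm{Card}\{(i,j) \in G_n \times G,\ \mu_{ij} = v\}}{\mathrm{Card}\{(i,j) \in G_n \times G_n,\ \mu_{ij} = v\}} $$
$$ = \sum_{v \in V_P} \sum_{(i,j) \in G_n \times G,\ \mu_{ij} = v,\ d_G(i,j) \le P} \mathcal{K}(f)_{ij}\, \mathcal{K}(g)_{ij} $$
$$ = \sum_{i \in G_n} \sum_{j \in G} \mathcal{K}(f)_{ij}\, \mathcal{K}(g)_{ij} = \mathrm{Tr}\big(\mathcal{K}_n(fg)\big). $$

That ends the proof of Lemma 6.3. □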
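On $\mathbb{Z}$ the exact correction can be verified directly. The sketch below rests on an assumption that is not stated in this section: for $G = \mathbb{Z}$ and $G_n$ a block of m consecutive vertices, the classes $\{\mu_{ij} = v\}$ are taken to be the lags $j - i$, so that the ratio of cardinalities appearing in the proof gives $B^{(n)}_{ij} = m / (m - |i - j|)$. The polynomials f and g are hypothetical.

```python
import numpy as np

def restricted_operator(coeffs, m):
    """K_n(f) for a polynomial f on Z (W = adjacency / 2), restricted to m central vertices."""
    pad = len(coeffs) - 1
    size = m + 2 * pad
    W = np.zeros((size, size))
    idx = np.arange(size - 1)
    W[idx, idx + 1] = 0.5
    W[idx + 1, idx] = 0.5
    K = np.zeros((size, size))
    Wk = np.eye(size)
    for c in coeffs:
        K += c * Wk
        Wk = Wk @ W
    return K[pad:pad + m, pad:pad + m]

m = 30
f = [1.0, 0.4, 0.2]            # hypothetical polynomial of degree P = 2
g = [1.5, -0.5, 0.3, 0.1]      # hypothetical second function (also a polynomial here)

Kf = restricted_operator(f, m)
Kg = restricted_operator(g, m)
Kfg = restricted_operator(np.convolve(f, g), m)

# Assumed form of the correction on Z: number of pairs with a given lag in
# G_n x G (equal to m) divided by the number of such pairs in G_n x G_n.
lag = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
B = m / (m - lag)
Qg = B * Kg                    # entrywise product

lhs = np.trace(Kf @ Qg)        # Tr(K_n(f) Q_n(g))
rhs = np.trace(Kfg)            # Tr(K_n(fg))
naive = np.trace(Kf @ Kg)      # Tr(K_n(f) K_n(g)), without correction
print(lhs, rhs, naive)
```

The corrected trace matches the target exactly, up to rounding, whereas the uncorrected one is off by a boundary term. This is the familiar unbiased-periodogram correction of time series, recovered as a special case.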
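6.2 Proofs of the lemmas of Theorem 3.1

Recall that the theorem relies on two lemmas. Lemma 3.2 states a condition on deterministic
sequences that provides the convergence of the maximizers of these sequences.

Proof of Lemma 3.2. Recall that $f_{\theta_0}$ denotes the true spectral density. Let $(\ell_n)_{n \in \mathbb{N}}$ be a
deterministic sequence of continuous functions such that

$$ \forall \theta \in \Theta, \quad \ell_n(\theta_0) - \ell_n(\theta) \xrightarrow[n \to \infty]{} \frac{1}{2} \int \Big( -\log\Big(\frac{f_{\theta_0}}{f_\theta}\Big) - 1 + \frac{f_{\theta_0}}{f_\theta} \Big)\, d\mu \quad (9) $$

uniformly as n tends to infinity. Denote moreover $\theta_n = \arg\max_\theta \ell_n(\theta)$. We aim at proving
that

$$ \theta_n \xrightarrow[n \to \infty]{} \theta_0. $$

Using the compactness of $\Theta$, let $\theta_\infty$ be an accumulation point of the sequence $(\theta_n)_{n \in \mathbb{N}}$,
and $(\theta_{n_k})_{k \in \mathbb{N}}$ be a subsequence converging to $\theta_\infty$. As the function

$$ \theta \mapsto \frac{1}{2} \int \Big( -\log\Big(\frac{f_{\theta_0}}{f_\theta}\Big) - 1 + \frac{f_{\theta_0}}{f_\theta} \Big)\, d\mu $$

is continuous on $\Theta$, and the convergence of $(\ell_n(\theta_0) - \ell_n(\theta))_{n \in \mathbb{N}}$ is uniform in $\theta$, we have

$$ \ell_{n_k}(\theta_0) - \ell_{n_k}(\theta_{n_k}) \to \frac{1}{2} \int \Big( -\log\Big(\frac{f_{\theta_0}}{f_{\theta_\infty}}\Big) - 1 + \frac{f_{\theta_0}}{f_{\theta_\infty}} \Big)\, d\mu. \quad (10) $$

But we can notice that, thanks to the definition of $\theta_n$, $\ell_{n_k}(\theta_0) - \ell_{n_k}(\theta_{n_k}) \le 0$. So, since the
function $x \mapsto -\log(x) + x - 1$ is non-negative and vanishes if, and only if, x = 1, we get
that $f_{\theta_\infty} = f_{\theta_0}$. By injectivity of the function $\theta \mapsto f_\theta$, we get $\theta_\infty = \theta_0$, for any accumulation
point $\theta_\infty$ of the sequence $(\theta_n)_{n \in \mathbb{N}}$, which ends the proof of this first lemma. □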
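The identification argument relies on the limit in (9) being non-negative and vanishing only at the true parameter. A minimal numerical sketch follows, under assumptions that are not taken from the paper: the spectral measure is that of $\mathbb{Z}$ (law of $\cos(U)$, U uniform on $[0, \pi]$), the parametric family is the hypothetical $f_\theta(x) = 1/(1 - \theta x)^2$, and $\theta_0 = 0.3$.

```python
import numpy as np

# Quadrature nodes for the assumed spectral measure (law of cos(U), U uniform on [0, pi])
t = (np.arange(20000) + 0.5) * np.pi / 20000
x = np.cos(t)

def f(theta, x):
    """Hypothetical AR-type spectral density, used only for illustration."""
    return 1.0 / (1.0 - theta * x) ** 2

def kullback(theta0, theta):
    """(1/2) * integral of (-log r - 1 + r) d mu, with r = f_theta0 / f_theta."""
    r = f(theta0, x) / f(theta, x)
    return 0.5 * np.mean(-np.log(r) - 1.0 + r)

theta0 = 0.3
grid = np.linspace(-0.8, 0.8, 161)
values = np.array([kullback(theta0, th) for th in grid])
argmin = grid[np.argmin(values)]
print(argmin, values.min())
```

On this grid the contrast is minimal, and numerically zero, at the true parameter, which is the property that forces every accumulation point of the maximizers to equal $\theta_0$.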

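Lemma 3.1 provides the uniform convergence of the contrasts of maximum likelihood and
approximated maximum likelihood to the Kullback information. The proof may be cut into
several lemmas.

Proof of Lemma 3.1.

First, notice that by construction, we have, for any $\theta \in \Theta$,

$$ I\!K(f_{\theta_0}, f_\theta) = \lim_n \mathbb{E}\big[ L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) \big] $$

when it exists. Then, we can compute

$$ L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) = -\frac{1}{2 m_n} \big( \log\det(\mathcal{K}_n(f_{\theta_0})) - \log\det(\mathcal{K}_n(f_\theta)) \big)
 - \frac{1}{2 m_n} \big( X_n^T \mathcal{K}_n(f_{\theta_0})^{-1} X_n - X_n^T \mathcal{K}_n(f_\theta)^{-1} X_n \big). $$

Corollary 6.1 of Lemma 6.1 provides the following convergence

$$ \frac{1}{m_n} \big( \log\det(\mathcal{K}_n(f_{\theta_0})) - \log\det(\mathcal{K}_n(f_\theta)) \big) \to \int \log\Big(\frac{f_{\theta_0}}{f_\theta}\Big)\, d\mu. \quad (12) $$

To prove the existence of $I\!K(f_{\theta_0}, f_\theta)$, it only remains to prove the $\mathbb{P}_{f_{\theta_0}}$-a.s. convergence
of $\frac{1}{m_n} X_n^T \mathcal{K}_n(f_\theta)^{-1} X_n$ to $\int \frac{f_{\theta_0}}{f_\theta}\, d\mu$ as n goes to infinity.

This is ensured by the following Lemma.

Lemma 6.4 (Convergence lemma). For respectively $A = \mathcal{K}_n(\frac{1}{f_\theta})$, $A = (\mathcal{K}_n(f_\theta))^{-1}$ or
$A = Q_n(\frac{1}{f_\theta})$, we have,

$$ \frac{1}{m_n} X_n^T A X_n \to \int \frac{f_{\theta_0}}{f_\theta}\, d\mu, \quad \mathbb{P}_{f_{\theta_0}}\text{-a.s.} $$

Lemma 6.4 combined with Corollary 6.1 ensures the $\mathbb{P}_{f_{\theta_0}}$-a.s. convergence of
$L_n(f_{\theta_0}) - L_n(f_\theta)$, $\bar{L}_n(f_{\theta_0}) - \bar{L}_n(f_\theta)$ and $\tilde{L}_n(f_{\theta_0}) - \tilde{L}_n(f_\theta)$ to $I\!K(f_{\theta_0}, f_\theta)$. It also provides the
$\mathbb{P}_{f_{\theta_0}}$-a.s. convergence of $L_n^{(u)}(f_{\theta_0}) - L_n^{(u)}(f_\theta)$ to $I\!K(f_{\theta_0}, f_\theta)$ in the ARp or MAp cases (see Section 3).
To complete the assertion of Lemma 3.1, it only remains to show the uniform convergence on $\Theta$ of the last quantities.
This will be done using an equicontinuity argument given by the following Lemma.

Lemma 6.5 (Equicontinuity lemma). For all n > 0, the sequence of functions

$$ \big( L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) \big)_{n \in \mathbb{N}} $$

is a $\mathbb{P}_{f_{\theta_0}}$-a.s. equicontinuous sequence on $(\{f_\theta, \theta \in \Theta\}, \|.\|_\infty)$. This property also holds for
$\bar{L}_n$ and $\tilde{L}_n$. Furthermore, the sequence $\big( L_n^{(u)}(f_{\theta_0}, X_n) - L_n^{(u)}(f_\theta, X_n) \big)_{n \in \mathbb{N}}$ is also $\mathbb{P}_{f_{\theta_0}}$-a.s.
equicontinuous, on $(\{f_\theta, \theta \in \Theta\}, \|.\|_{1,pol})$.

We can now end the proof of Lemma 3.1.

First, notice that the space $\{f_\theta, \theta \in \Theta\}$ is compact for the topology of uniform
convergence. This also holds for $(\{f_\theta, \theta \in \Theta\}, \|.\|_{1,pol})$. So, there exists a dense sequence $(f_{\theta_p})_{p \in \mathbb{N}}$.

Then, using Lemma 6.4 and Corollary 6.1, the sequence $\big( L_n(f_{\theta_0}, X_n) - L_n(f_{\theta_p}, X_n) \big)_{n \in \mathbb{N}}$
converges $\mathbb{P}_{f_{\theta_0}}$-a.s. to $I\!K(f_{\theta_0}, f_{\theta_p})$.

If a sequence of functions is equicontinuous and converges pointwise on a dense subset of
its domain, and if its co-domain is a complete space, then the sequence converges pointwise
on the whole domain [20].

Using this well-known property, we obtain, $\mathbb{P}_{f_{\theta_0}}$-a.s., the pointwise convergence of

$$ \big( L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) \big)_{n \in \mathbb{N}} $$

to $I\!K(f_{\theta_0}, f_\theta)$, for any $\theta \in \Theta$.

Furthermore, if a sequence of functions is equicontinuous and converges pointwise on its
domain, then this convergence is uniform on any compact subspace of the domain [20].

Thus, we get, $\mathbb{P}_{f_{\theta_0}}$-a.s., the uniform convergence on $\Theta$ of the sequence

$$ \big( L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) \big)_{n \in \mathbb{N}} $$

to $I\!K(f_{\theta_0}, f_\theta)$.

Using the same kind of arguments, this uniform convergence also holds for $\bar{L}_n$, $\tilde{L}_n$ and $L_n^{(u)}$.
This concludes the proof of Lemma 3.1.

□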

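6.3 Proof of the technical lemmas

Proof of Lemma 6.4.

Let $\theta \in \Theta$. First, consider the case $A_n = \mathcal{K}_n(\frac{1}{f_\theta})$. We aim at proving that

$$ \frac{1}{m_n} X_n^T A_n X_n \to \int \frac{f_{\theta_0}}{f_\theta}\, d\mu, \quad \mathbb{P}_{f_{\theta_0}}\text{-a.s.} $$

To do that, we make use of classical tools of large deviations (see [9]). We compute the
Laplace transform of $X_n^T A_n X_n$:

$$ \mathbb{E}\big[ e^{\lambda X_n^T A_n X_n} \big]
 = \int \frac{1}{(\sqrt{2\pi})^{m_n} \sqrt{\det(\mathcal{K}_n(f_{\theta_0}))}}\, e^{\lambda x^T A_n x - \frac{1}{2} x^T \mathcal{K}_n(f_{\theta_0})^{-1} x}\, dx $$
$$ = \frac{1}{\sqrt{\det(\mathcal{K}_n(f_{\theta_0}))}}\, \frac{1}{\sqrt{\det\big( \mathcal{K}_n(f_{\theta_0})^{-1} - 2\lambda \mathcal{K}_n(\frac{1}{f_\theta}) \big)}} $$
$$ = \frac{1}{\sqrt{\det\big( I_{G_n} - 2\lambda\, \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(\frac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \big)}}. $$

These last equalities hold as soon as $I_{G_n} - 2\lambda\, \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(\frac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}}$ is positive. This is
true whenever $\lambda \le 0$ or small enough.
Now, for $\lambda \le 0$, define

$$ \phi_n(\lambda) := \frac{1}{m_n} \log\Big( \mathbb{E}\big[ e^{\lambda X_n^T A_n X_n} \big] \Big). $$

This function verifies

$$ \phi_n(\lambda) = -\frac{1}{2 m_n} \log\det\Big( I_{G_n} - 2\lambda\, \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(\tfrac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \Big). $$

Define also

$$ \phi(\lambda) = \lim_n \phi_n(\lambda). $$

We get, using Corollary 6.1,

$$ \phi(\lambda) = -\frac{1}{2} \int \log\Big( 1 - 2\lambda \frac{f_{\theta_0}}{f_\theta} \Big)\, d\mu. $$

We can also compute

$$ \phi''(\lambda) = \int \frac{2 \big( \frac{f_{\theta_0}}{f_\theta} \big)^2}{\big( 1 - 2\lambda \frac{f_{\theta_0}}{f_\theta} \big)^2}\, d\mu > 0. $$

As usual, we define the convex conjugate of $\phi$ by

$$ \phi^*(t) := \sup_{\lambda \in \mathbb{R}^-} \big[ \lambda t - \phi(\lambda) \big], \quad t \in \mathbb{R}. $$

As soon as $\phi$ is strictly convex, $\phi^*(t) > \phi(0) = 0$, for any $t \ne \phi'(0) = \int \frac{f_{\theta_0}}{f_\theta}\, d\mu$.
We can now write, for $\lambda \le 0$,

$$ \frac{1}{m_n} \log\Big( \mathbb{P}\Big( \frac{1}{m_n} X_n^T A_n X_n \ge t \Big) \Big) = \frac{1}{m_n} \log\Big( \mathbb{P}\big( e^{\lambda X_n^T A_n X_n} \ge e^{\lambda m_n t} \big) \Big) $$
$$ \le \frac{1}{m_n} \log\big( e^{-\lambda m_n t} \big) + \frac{1}{m_n} \log\Big( \mathbb{E}\big[ e^{\lambda X_n^T A_n X_n} \big] \Big) $$
$$ \le -\lambda t + \phi_n(\lambda). $$

Then we get, $\forall t > \int \frac{f_{\theta_0}}{f_\theta}\, d\mu$,

$$ \limsup_n \Big( \frac{1}{m_n} \log\Big( \mathbb{P}\Big( \frac{1}{m_n} X_n^T A_n X_n \ge t \Big) \Big) \Big) \le -\lambda t + \phi(\lambda). $$

So that, taking the infimum on $\lambda$, we get

$$ \limsup_n \Big( \frac{1}{m_n} \log\Big( \mathbb{P}\Big( \frac{1}{m_n} X_n^T A_n X_n \ge t \Big) \Big) \Big) \le -\phi^*(t) < 0. $$

We can obtain the same bound for $t < \int \frac{f_{\theta_0}}{f_\theta}\, d\mu$. By the Borel-Cantelli theorem, we get the
$\mathbb{P}_{f_{\theta_0}}$-almost sure convergence of $\frac{1}{m_n} X_n^T A_n X_n$ to $\int \frac{f_{\theta_0}}{f_\theta}\, d\mu$. To prove the same convergence with
$A_n = (\mathcal{K}_n(f_\theta))^{-1}$, we have to show that the difference between the spectral empirical measures
of $\mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(\frac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}}$ and $\mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(f_\theta)^{-1} \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}}$ converges weakly to zero. It is
sufficient to control the convergence of every moment, because these two measures both
have compact support.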

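For this, we make use of the Schatten norms. For any A, B matrices of $M_{m_n}(\mathbb{R})$, we
define

$$ \|A\|_{Sch,p} = \Big( \sum_k s_k(A)^p \Big)^{\frac{1}{p}}, $$

where $s_k(A)$ are the singular values of A.
Note that

$$ |\mathrm{Tr}(AB)| \le \|AB\|_{Sch,1} \le \|A\|_{Sch,1}\, \|B\|_{Sch,\infty}. $$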
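The two inequalities above are standard and easy to check numerically. The sketch below computes Schatten norms from singular values with NumPy; the helper name `schatten_norm` and the random test matrices are illustrative choices only.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm of A: the l^p norm of its singular values (p = inf gives the largest)."""
    s = np.linalg.svd(A, compute_uv=False)
    if np.isinf(p):
        return s.max()
    return (s ** p).sum() ** (1.0 / p)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
B = rng.standard_normal((6, 6))

lhs = abs(np.trace(A @ B))                              # |Tr(AB)|
mid = schatten_norm(A @ B, 1)                           # ||AB||_{Sch,1}
rhs = schatten_norm(A, 1) * schatten_norm(B, np.inf)    # ||A||_{Sch,1} ||B||_{Sch,inf}
print(lhs, mid, rhs)
```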



Recall that since $f_\theta \in \mathcal{F}_\rho$, we have $e^{-\rho} \le f_\theta \le e^{\rho}$. Hence, for any $p \ge 1$,



1 



Tr(/C^(-)/C^(/,J-/C,-^(/,)/C^(/,,; 
Je 



< 



\jCnifor'JC'M\ 



Sch,oo 



< 



K{-Q)KUe) - Ig. 



SchA 



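To obtain the same bound with $A_n = Q_n(\frac{1}{f_\theta})$, we have to prove that the difference between
the spectral empirical measures of $\mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} \mathcal{K}_n(\frac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}}$ and $\mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}} Q_n(\frac{1}{f_\theta}) \mathcal{K}_n(f_{\theta_0})^{\frac{1}{2}}$
converges weakly to zero. This last assertion is a direct consequence of Lemma 6.2. So,
we get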



fo 



□ 
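The convergence stated in Lemma 6.4 can be observed in simulation. The sketch below is a toy illustration under assumptions that are not the paper's setting: $G_n$ is a path of m vertices with $W$ the adjacency operator divided by 2, the family is the hypothetical $f_\theta(x) = 1 + \theta x$ (so that $\mathcal{K}_n(f_\theta) = I + \theta W$ on the path), the quadratic form uses $A = (\mathcal{K}_n(f_\theta))^{-1}$, and the limiting spectral measure is that of $\mathbb{Z}$. The parameter values and the random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 1000
theta0, theta = 0.5, -0.3      # hypothetical true and candidate parameters

idx = np.arange(m - 1)
W = np.zeros((m, m))
W[idx, idx + 1] = 0.5
W[idx + 1, idx] = 0.5

K0 = np.eye(m) + theta0 * W    # covariance K_n(f_theta0) for the assumed family
K = np.eye(m) + theta * W      # K_n(f_theta)

# One Gaussian sample X_n ~ N(0, K0), then the normalized quadratic form
X = np.linalg.cholesky(K0) @ rng.standard_normal(m)
quad = X @ np.linalg.solve(K, X) / m

# Its expectation, (1/m) Tr(K^{-1} K0), and the limiting integral
mean_quad = np.trace(np.linalg.solve(K, K0)) / m
t = (np.arange(20000) + 0.5) * np.pi / 20000
limit = np.mean((1 + theta0 * np.cos(t)) / (1 + theta * np.cos(t)))
print(quad, mean_quad, limit)
```

The expectation is already very close to the integral for moderate m, and a single realization fluctuates around it at the scale $1/\sqrt{m}$, consistent with the exponential bounds used in the proof.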
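Proof of Lemma 6.5.

Recall that we aim at proving that, $\mathbb{P}_{f_{\theta_0}}$-a.s., the sequence of functions

$$ \big( L_n(f_{\theta_0}, X_n) - L_n(f_\theta, X_n) \big)_{n \in \mathbb{N}} $$

is equicontinuous on $\{f_\theta, \theta \in \Theta\}$, and that this property also holds for $\bar{L}_n$, $\tilde{L}_n$ and $L_n^{(u)}$.
First, we will prove the equicontinuity of the sequence

$$ \Big( \frac{1}{m_n} \log\det(\mathcal{K}_n(f_\theta)) \Big)_{n \in \mathbb{N}}. $$

Let $\theta, \theta' \in \Theta$.

Denote $\lambda_i$ the eigenvalues of $\mathcal{K}_n(f_{\theta'})^{-1} \big( \mathcal{K}_n(f_{\theta'}) - \mathcal{K}_n(f_\theta) \big)$. Since $f_\theta \in \mathcal{F}_\rho$, we have
$e^{-\rho} \le f_\theta \le e^{\rho}$.

Notice that we have

$$ \sup_{i \in G_n} |\lambda_i| = \big\| \mathcal{K}_n(f_{\theta'})^{-1} \big( \mathcal{K}_n(f_{\theta'}) - \mathcal{K}_n(f_\theta) \big) \big\|_{2,op} \le e^{\rho}\, \|f_{\theta'} - f_\theta\|_\infty. $$

So that, to prove the equicontinuity, we may assume that $\theta$ is close enough to $\theta'$ to ensure
that $\sup_{i \in G_n} |\lambda_i| \le \frac{1}{2}$.

We have

$$ \frac{1}{m_n} \Big| \log\det(\mathcal{K}_n(f_{\theta'})) - \log\det(\mathcal{K}_n(f_\theta)) \Big|
 = \frac{1}{m_n} \Big| \log\det\Big( I_{G_n} - \mathcal{K}_n(f_{\theta'})^{-1} \big( \mathcal{K}_n(f_{\theta'}) - \mathcal{K}_n(f_\theta) \big) \Big) \Big| $$
$$ = \frac{1}{m_n} \Big| \sum_{i \in G_n} \log(1 - \lambda_i) \Big|
 \le \frac{1}{m_n} \sum_{i \in G_n} |\log(1 - \lambda_i)| $$
$$ \le 2 \log(2) \sup_{i \in G_n} |\lambda_i| $$
$$ \le 2 \log(2)\, e^{\rho}\, \|f_{\theta'} - f_\theta\|_\infty. $$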
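The constant $2\log(2)$ comes from the elementary bound $|\log(1 - x)| \le 2\log(2)\,|x|$ for $|x| \le \frac{1}{2}$, which is why the eigenvalues are assumed to satisfy $\sup_i |\lambda_i| \le \frac{1}{2}$. A quick numerical check of this elementary inequality (the grid is an arbitrary choice):

```python
import numpy as np

# Grid on [-1/2, 1/2], excluding a neighbourhood of 0 where the ratio is 0/0
x = np.linspace(-0.5, 0.5, 10001)
x = x[np.abs(x) > 1e-12]

ratio = np.abs(np.log1p(-x)) / np.abs(x)   # |log(1 - x)| / |x|
worst = ratio.max()
print(worst, 2 * np.log(2))
```

The ratio is largest at $x = \frac{1}{2}$, where it equals $2\log(2)$, so the constant cannot be improved on this interval.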

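Furthermore, the sequence $\big( \int \log(f_\theta)\, d\mu \big)_{n \in \mathbb{N}}$ is also equicontinuous since, using a Taylor
formula,

$$ \Big| \int \log(f_{\theta'})\, d\mu - \int \log(f_\theta)\, d\mu \Big| \le e^{\rho}\, \|f_{\theta'} - f_\theta\|_\infty. $$

Now we tackle the equicontinuity of the sequences

$$ \Big( \frac{1}{m_n} X_n^T \mathcal{K}_n(\tfrac{1}{f_\theta}) X_n \Big)_{n \in \mathbb{N}}, \quad \Big( \frac{1}{m_n} X_n^T (\mathcal{K}_n(f_\theta))^{-1} X_n \Big)_{n \in \mathbb{N}} $$

and

$$ \Big( \frac{1}{m_n} X_n^T Q_n(\tfrac{1}{f_\theta}) X_n \Big)_{n \in \mathbb{N}}. $$

Notice first that, for any matrix $B \in M_{m_n}(\mathbb{R})$,

$$ \frac{1}{m_n} \big| X_n^T B X_n \big| \le \frac{1}{m_n} \|B\|_{2,op}\, \big| X_n^T X_n \big|. $$

It is thus sufficient to prove the equicontinuity of the sequences

$$ \big( \mathcal{K}_n(\tfrac{1}{f_\theta}) \big)_{n \in \mathbb{N}}, \quad \big( (\mathcal{K}_n(f_\theta))^{-1} \big)_{n \in \mathbb{N}} \quad \text{and} \quad \big( Q_n(\tfrac{1}{f_\theta}) \big)_{n \in \mathbb{N}} $$

for the norm $\|.\|_{2,op}$.
Note that

$$ \Big\| \mathcal{K}_n(\tfrac{1}{f_{\theta'}}) - \mathcal{K}_n(\tfrac{1}{f_\theta}) \Big\|_{2,op} \le \Big\| \frac{1}{f_{\theta'}} - \frac{1}{f_\theta} \Big\|_\infty \le e^{2\rho}\, \|f_{\theta'} - f_\theta\|_\infty. $$

Then,

$$ \big\| (\mathcal{K}_n(f_{\theta'}))^{-1} - (\mathcal{K}_n(f_\theta))^{-1} \big\|_{2,op}
 \le \big\| (\mathcal{K}_n(f_{\theta'}))^{-1} \big\|_{2,op}\, \big\| (\mathcal{K}_n(f_\theta))^{-1} \big\|_{2,op}\, \big\| \mathcal{K}_n(f_{\theta'}) - \mathcal{K}_n(f_\theta) \big\|_{2,op}
 \le e^{2\rho}\, \|f_{\theta'} - f_\theta\|_\infty. $$

Then, recall that, for any symmetric matrix $B \in M_n(\mathbb{R})$, we have

$$ \|B\|_{2,op} \le \|B\|_{\infty,op}. $$

Recall also that $Q_n(f_\theta) = B^{(n)} \odot \mathcal{K}_n(f_\theta)$. We have

$$ \Big\| Q_n(\tfrac{1}{f_{\theta'}}) - Q_n(\tfrac{1}{f_\theta}) \Big\|_{2,op}
 \le \Big\| Q_n(\tfrac{1}{f_{\theta'}}) - Q_n(\tfrac{1}{f_\theta}) \Big\|_{\infty,op} $$
$$ \le \sup_{i,j = 1, \dots, n} \big| B^{(n)}_{ij} \big|\; \Big\| \mathcal{K}_n(\tfrac{1}{f_{\theta'}}) - \mathcal{K}_n(\tfrac{1}{f_\theta}) \Big\|_{\infty,op} $$
$$ \le (1 + u_n)\, \Big\| \frac{1}{f_{\theta'}} - \frac{1}{f_\theta} \Big\|_{1,pol} \quad \text{(see Assumption 3.5)}. $$

Since the map $f \mapsto \frac{1}{f}$ is continuous over $\mathcal{F}_\rho$, which is compact, we get the uniform
equicontinuity of the map $f_\theta \mapsto Q_n(\frac{1}{f_\theta})$ (for the norm $\|.\|_{1,pol}$).

This concludes the proof of Lemma 6.5.

□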
Proof of Lemma 3.3.

We aim at proving the asymptotic normality of $\sqrt{m_n}\, (L_n^{(u)})'(\theta_0)$.
Using the Fourier transform, it is sufficient to prove that

hniE [exp (zv^t ((4"^)'(^o)))] = exp J ^t'^-i^{t)dfi{t) 

Recall that we have 

(e)m^4/fd,+J^ATa„(|).Y„. 

We can compute 



^ ("^/ 7^"^^^ 2^^' (-^'"f))) Lemma El 



< Cvn^frn^ -T- (see Assumption I3.6p . 



If we define 



and 

the last equality means that 

v/^(E -Z) ^0. 

This holds only if $f_{\theta_0}$ is a polynomial, or if all the $f_\theta$, $\theta \in \Theta$, are polynomials. This brings
out that the second theorem holds for the ARp or MAp case. It also explains the term
'unbiased estimator' used for $\theta^{(u)}$.

Then, it is sufficient to show



limE [exp {i^m^{Zn - E = exp j ^ 

If $\tau_k$ denotes the eigenvalues of the symmetric matrix



flit) 



M„:=|/C„(/eo)^Q„(^)/C„(/, 



then we can write 



where $(Y_k)_{k \in G_n}$ has the standard Gaussian distribution on $\mathbb{R}^{m_n}$.
The independence of the $Y_k$ leads to

log (E [exp (2^^ (Z„ - E [Z„]))]) = -J2 



k=l 



+ -log(l - 2z— = 



The $\tau_k$ are bounded, thanks to the following inequality:



^/C„(/,JtQ„(:^)/C„(/,„; 



'00 



< 



< 



2,op 



2,op 
2 



f 

n\ r2 ' 
f 

f2 ' 
ho 

The Taylor expansion of $\log\big( 1 - 2i \frac{\tau_k}{\sqrt{m_n}} \big)$ gives



2,op 



2,op 



l,op 



00 



2,op 



2,op 



^ m„ 

log (E [exp (Z„ - E [Z„]))]) = J^T-k+Rn- 



k=l 



With < V"" IrfcP 

I "I — mny/rn^ Z-^k=l I "I 

Since the $\tau_k$ are bounded, the assertion will be proved if we show that



1 1 r 1 

— Tr(M„^) = — E-^'"^ / 

fell >^ ^ 



flit) 



dfiit). 



This last convergence is a consequence of Lemmas 6.1 and 6.2.

This provides the asymptotic normality of $\sqrt{m_n}\, (L_n^{(u)})'(\theta_0)$ and concludes the proof of
Lemma 3.3:

v^(e))'(^o) ^ [ (§A d/i). 



□ 



Proof of Lemma 3.4.

We now aim at proving the following $\mathbb{P}_{f_{\theta_0}}$-a.s. convergence:



-1 1 



We have 



2m^ 



h 



W - /^70 
/I 
which leads to

1 r f f'Jf.-( fa. (2( f!.)^ - f'l fn\ \ 



' <"> -^--2 7 — Ti — ^ Tl 



?1— ^-oo 



Since the sequence $(L_n^{(u)})''$ is equicontinuous and $\theta_n \to \theta_0$, we obtain the desired convergence:



do 

□ 

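Proof of Lemma 3.5.

We want to compute the asymptotic Fisher information. As usual, it is sufficient to
compute

$$ \lim_n \frac{1}{m_n} \mathrm{Var}\big( L_n'(\theta_0) \big) = \lim_n \frac{1}{2 m_n} \mathrm{Tr}\big( M_n(\theta_0)^2 \big), $$

where $M_n(\theta) = \mathcal{K}_n(f_\theta)^{-1}\, \mathcal{K}_n(f'_\theta)\, \mathcal{K}_n(f_\theta)^{-1}\, \mathcal{K}_n(f_{\theta_0})$.

This leads, together with Lemma 6.1 and Assumption 3.3, to

$$ \frac{1}{m_n} \mathrm{Var}\big( L_n'(\theta_0) \big) \to \frac{1}{2} \int \Big( \frac{f'_{\theta_0}}{f_{\theta_0}} \Big)^2 d\mu. $$

This ends the proof of the last lemma. □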
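The trace formula for the Fisher information can be evaluated on a toy model. The sketch below assumes a path of m vertices with $W$ the adjacency operator divided by 2, the hypothetical family $f_\theta(x) = 1 + \theta x$ (so that $\mathcal{K}_n(f_\theta) = I + \theta W$ and $\mathcal{K}_n(f'_\theta) = W$ on the path), and the spectral measure of $\mathbb{Z}$ for the limit. None of these choices comes from the paper.

```python
import numpy as np

m = 400
theta0 = 0.4                      # hypothetical true parameter

idx = np.arange(m - 1)
W = np.zeros((m, m))
W[idx, idx + 1] = 0.5
W[idx + 1, idx] = 0.5

K0 = np.eye(m) + theta0 * W       # K_n(f_theta0) for the assumed family
dK = W                            # K_n(f'_theta) since f'_theta(x) = x
K0_inv = np.linalg.inv(K0)

# M_n(theta0) = K^{-1} K' K^{-1} K_n(f_theta0), evaluated at theta = theta0
M = K0_inv @ dK @ K0_inv @ K0
fisher_n = np.trace(M @ M) / (2 * m)

# Limit (1/2) * integral of (f'/f)^2 against the law of cos(U), U uniform on [0, pi]
t = (np.arange(20000) + 0.5) * np.pi / 20000
c = np.cos(t)
fisher_limit = 0.5 * np.mean((c / (1 + theta0 * c)) ** 2)
print(fisher_n, fisher_limit)
```

For this family the finite-size value is within a boundary correction of order 1/m of the limiting integral.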